(54) Title: EXPRESSION OF FRUCTOSE 1,6 BISPHOSPHATE ALDOLASE IN TRANSGENIC PLANTS
(57) Abstract 

Fructose-1,6-bisphosphate aldolase (FDA) is an enzyme that reversibly catalyzes the conversion of triosephosphate into
fructose-1,6-bisphosphate. In the leaf, this enzyme is located in the chloroplast (starch synthesis) and the cytosol (sucrose biosynthesis).
Transgenic plants were generated that express the E. coli fda gene in the chloroplast to improve plant yield by increasing leaf starch
biosynthetic ability in particular and sucrose production in general. Leaves from plants expressing the fda transgene showed a significantly
higher starch accumulation, as compared to control plants expressing the null vector, particularly early in the photoperiod, but had lower leaf
sucrose. Transgenic plants also had a significantly higher root mass. Furthermore, transgenic potatoes expressing fda exhibited improved
uniformity of solids.






WO 98/58069 



PCT/US98/12447 



EXPRESSION OF FRUCTOSE 1,6 BISPHOSPHATE ALDOLASE IN TRANSGENIC PLANTS

This invention relates to the expression of fructose 1,6 bisphosphate aldolase 
(FDA) in transgenic plants to increase or improve plant growth and development, yield, 
vigor, stress tolerance, carbon allocation and storage into various storage pools, and
distribution of starch. Transgenic plants expressing FDA have increased carbon 
assimilation, export and storage in plant source and sink organs, which results in growth, 
yield and quality improvements in crop plants. 

Recent advances in genetic engineering have provided the prerequisite tools to
transform plants to contain alien (often referred to as "heterologous") or improved
endogenous genes. These genes can lead either to an improvement of an already existing
pathway in plant tissues or to an introduction of a novel pathway to modify product levels,
increase metabolic efficiency, and/or save on energy cost to the cell. It is presently
possible to produce plants with unique physiological and biochemical traits and
characteristics of high agronomic and crop processing importance. Traits that play an
essential role in plant growth and development, crop yield potential and stability, and crop
quality and composition include enhanced carbon assimilation, efficient carbon storage,
and increased carbon export and partitioning.

Atmospheric carbon fixation (photosynthesis) by plants represents the major source
of energy to support processes in all living organisms. The primary sites of photosynthetic
activity, generally referred to as "source organs", are mature leaves and, to a lesser extent,
green stems. The major carbon products of source leaves are starch, which represents the
transitory storage form of carbohydrate in the chloroplast, and sucrose, which represents
the predominant form of carbon transport in higher plants. Other plant parts named "sink
organs" (e.g., roots, fruit, flowers, seeds, tubers, and bulbs) are generally not autotrophic
and depend on import of sucrose or other major translocatable carbohydrates for their
growth and development. The storage sinks deposit the imported metabolites as sucrose
and other oligosaccharides, starch and other polysaccharides, proteins, and triglycerides.

In leaves, the primary products of the Calvin Cycle (the biochemical pathway
leading to carbon assimilation) are glyceraldehyde 3-phosphate (G3P) and
dihydroxyacetone phosphate (DHAP), also known as triose phosphates (triose-P). The
condensation of G3P and DHAP into fructose 1,6 bisphosphate (FBP) is catalyzed
reversibly by the enzyme fructose 1,6 bisphosphate aldolase (FDA), and various isozymes
are known. The acidic isoenzyme appears to be chloroplastic and comprises about 85% of
the total leaf aldolase activity. The basic isoenzyme is cytosolic. Both isoenzymes appear
to be encoded by the nuclear genome and are encoded by different genes (Lebherz et al.,
1984).
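
For reference, the reversible reaction described above can be written as a scheme, using only the abbreviations defined in the text. This is a restatement of the preceding paragraph, not an additional claim:

```latex
\[
\mathrm{DHAP} + \mathrm{G3P} \;\rightleftharpoons\; \mathrm{FBP}
\]
```

Read left to right, the scheme is the condensation that feeds starch and sucrose synthesis; read right to left, it is the cleavage step of glycolysis.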

In the leaf, the chloroplast FDA is an essential enzyme in the Calvin Cycle, where
its activity generates metabolites for starch biosynthesis. Removal of more than 40% of
the plastidic aldolase enzymatic activity by antisense technology reduced leaf starch
accumulation as well as soluble protein and chlorophyll levels, and also reduced plant
growth and root formation (Sonnewald et al., 1994). In contrast, the cytosolic FDA is part
of the sucrose biosynthetic pathway, where it catalyzes the production of FBP.
Moreover, cytosolic FDA is also a key enzyme in the glycolytic and gluconeogenic
pathways in both source and sink plant tissues.

In the potato industry, production of higher starch and uniform solids tubers is
highly desirable and valuable. The current potato varieties that are used for french fry
production, such as Russet Burbank and Shepody, suffer from a non-uniform deposition of
solids between the tuber pith (inner core) and the cortex (outer core). French fry strips that
are taken from pith tissue are higher in water content when compared to outer cortex
french fry strips; cortex tissue typically displays a solids level of twenty-four percent
whereas pith tissue typically displays a solids level of seventeen percent. Consequently, in
the french fry production process, the pith strips need to be blanched, dried, and par-fried
for longer times to eliminate the excess water. Adequate processing of the pith fries
results in the over-cooking of fries from the high solids cortex. The blanching, drying, and
par-frying times of the french fry processor need to be adjusted accordingly to
accommodate the low solids pith strips and the high solids cortex strips. A higher solids
potato with a more uniform distribution of starch from pith to cortex would allow for a
more uniform finished fry product, with higher plant throughput and cost savings due to
reduced blanch, dry and par-fry times.
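
The processing penalty implied by these solids figures can be made concrete with a little arithmetic. The sketch below is an illustration only: it takes the typical values quoted in the text (24% solids in cortex, 17% in pith), assumes that whatever is not solids is water, and computes how much water accompanies each gram of solids. The function name is hypothetical and does not come from the source.

```python
def water_per_gram_solids(solids_percent: float) -> float:
    """Grams of water per gram of solids, assuming tissue = solids + water."""
    if not 0 < solids_percent <= 100:
        raise ValueError("solids_percent must be in (0, 100]")
    return (100.0 - solids_percent) / solids_percent


cortex = water_per_gram_solids(24.0)  # typical cortex solids quoted in the text
pith = water_per_gram_solids(17.0)    # typical pith solids quoted in the text

# Relative excess of water in pith strips, per gram of solids.
extra_water_fraction = pith / cortex - 1.0

print(f"cortex: {cortex:.2f} g water per g solids")
print(f"pith:   {pith:.2f} g water per g solids")
print(f"pith carries {extra_water_fraction:.0%} more water per g solids")
```

Under this simplifying assumption, pith strips carry roughly half again as much water per gram of solids as cortex strips, which is consistent with the longer blanch, dry, and par-fry times described above.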

Although various fructose 1,6 bisphosphate aldolases have been previously
characterized, it has been discovered that overexpression of the enzyme in a transgenic
plant provides advantageous results in the plant such as increasing the assimilation, export
and storage of carbon, increasing the production of oils and/or proteins in the plant and
improving tuber solids uniformity.

The present invention provides structural DNA constructs that encode a fructose
1,6 bisphosphate aldolase (FDA) enzyme and that are useful in increasing carbon
assimilation, export, and storage in plants.

In accomplishing the foregoing, there is provided, in accordance with one aspect of
the present invention, a method of producing genetically transformed plants that have
elevated carbon assimilation, storage, export, and improved solids uniformity comprising
the steps of:

(a) inserting into the genome of a plant a recombinant, double-stranded DNA
molecule comprising

(i) a promoter that functions in the cells of a target plant tissue,

(ii) a structural DNA sequence that causes the production of an RNA sequence
that encodes a fructose 1,6 bisphosphate aldolase enzyme,

(iii) a 3' non-translated DNA sequence that functions in plant cells to cause
transcriptional termination and the addition of polyadenylated nucleotides
to the 3' end of the RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from transformed plant cells genetically transformed plants that
have elevated FDA activity.

In another aspect of the present invention there is provided a recombinant,
double-stranded DNA molecule comprising in sequence

(i) a promoter that functions in the cells of a target plant tissue,

(ii) a structural DNA sequence that causes the production of an RNA sequence
that encodes a fructose 1,6 bisphosphate aldolase enzyme,

(iii) a 3' non-translated DNA sequence that functions in plant cells to cause
transcriptional termination and the addition of polyadenylated nucleotides to
the 3' end of the RNA sequence.

In a further aspect of the present invention, the structural DNA sequence that
causes the production of an RNA sequence that encodes a fructose 1,6 bisphosphate
aldolase enzyme is coupled with a chloroplast transit peptide to facilitate transport of the
enzyme to the plastid.
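
The ordering of elements in such a DNA molecule can be sketched as a small data structure. This is a minimal illustration, not a description of any vector named in this document: the class and method names are hypothetical, and the element labels are placeholders chosen from components mentioned elsewhere in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Hypothetical model of the recombinant DNA molecule described above."""

    promoter: str                           # element (i)
    coding_sequence: str                    # element (ii), encodes FDA
    terminator: str                         # element (iii), 3' non-translated region
    transit_peptide: Optional[str] = None   # optional plastid-targeting fusion

    def elements_in_order(self) -> List[str]:
        """Return the elements 5' to 3'; the transit peptide precedes the coding sequence."""
        elements = [self.promoter]
        if self.transit_peptide is not None:
            elements.append(self.transit_peptide)
        elements.extend([self.coding_sequence, self.terminator])
        return elements


# Placeholder labels only; not the composition of any specific plasmid.
plastid_cassette = ExpressionCassette(
    promoter="enhanced CaMV 35S promoter",
    coding_sequence="E. coli fda",
    terminator="NOS 3' region",
    transit_peptide="chloroplast transit peptide (CTP)",
)
cytosolic_cassette = ExpressionCassette(
    promoter="enhanced CaMV 35S promoter",
    coding_sequence="E. coli fda",
    terminator="NOS 3' region",
)

print(" -> ".join(plastid_cassette.elements_in_order()))
print(" -> ".join(cytosolic_cassette.elements_in_order()))
```

Omitting the transit peptide models a construct whose product remains in the cytosol; including it models the plastid-directed aspect.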
In accordance with the present invention, an improved means for increasing carbon
assimilation, storage and export in the source tissues of various plants is provided. Further
means of improved carbon accumulation in sinks (such as roots, tubers, seeds, stems, and
bulbs) are provided, thus increasing the size of various sinks (larger roots, tubers, etc.) and
subsequently increasing yield and crop productivity. The increased carbon availability to
these sinks would also improve composition and use efficiency in the sink (oil, protein,
starch and/or sucrose production, and/or solids uniformity).

Various advantages may be achieved by the aims of the present invention, 
including: 

First, increasing the expression of the FDA enzyme in the chloroplast would
increase the flow of carbon through the Calvin Cycle and increase atmospheric carbon
assimilation during early photoperiod. This would result in an increase in photosynthetic
efficiency and an increase in chloroplast starch production (a leaf carbon storage form
degraded during periods when photosynthesis is low or absent). Both of these responses
would lead to an increase in sucrose production by the leaf and a net increase in carbon
export during a given photoperiod. This increase in source capacity is a desirable trait in
crop plants and would lead to increased plant growth, storage ability, yield, vigor, and
stress tolerance.

Second, increasing FDA expression in the cytosol of photosynthetic cells would
lead to an increase in sucrose production and export out of source leaves. This increase in
source capacity is a desirable trait in crop plants and would lead to increased plant growth,
storage ability, yield, vigor, and stress tolerance.

Third, expression of FDA in sink tissues can show several desirable traits, such as
increased amino acid and/or fatty acid pools via increases in carbon flux through
glycolysis (and thus pyruvate levels) in seeds or other sinks and increased starch levels as
a result of increased production of glucose 6-phosphate in seeds, roots, stems, and tubers
where starch is a major storage nonstructural carbohydrate (reverse glycolysis). This
increase in sink strength is a desirable trait in crop plants and would lead to increased plant
growth, storage ability, yield, vigor, and stress tolerance.

Fourth, the invention is particularly desirable for use in the commercial production
of foods derived from potatoes. Potatoes used for the production of french fries and other
products suffer from a non-uniform distribution of solids between the tuber pith (inner
core) and the cortex (outer core). Thus, french fry strips from the pith regions of such
tubers have a low solids content and a high water content in comparison to cortex strips
from the same tubers. Therefore, the french fry processor attempts to adjust the processing
parameters so that the final inner strips are sufficiently cooked while the outer cortex strips
are not overcooked. The results of such adjustments, however, are highly variable and
may lead to poor quality product. Transgenic potatoes expressing fda will provide to the
french fry and potato chip processor a raw product that consistently displays a higher tuber
solids uniformity with acceptable agronomic traits. In the french fry plant production
process, inner pith fry strips from higher solids uniformity tubers will require less time to
blanch, less time to dry to a specific solids content, and less time to par-fry before freezing
and shipping to retail and institutional end-users.

Therefore, with respect to potatoes, the present invention provides 1) a higher
quality, more uniform finished fry product in which french fries from all tuber regions, when
processed, are nearly the same, 2) a higher throughput in the french fry processing plant
due to lower processing times, and 3) processor cost savings due to the lower energy input
required for shorter blanch, dry, and par-fry times. A raw tuber product that displays a
higher solids uniformity will also produce a potato chip that has a reduced saddle curl and
a reduced tendency for center bubble, both of which are undesirable qualities in the potato chip
industry. Reduced fat content would also result; this would contribute to improved
consumer appeal and lower oil use (and costs) for the processor. The increase in solids
uniformity will also translate to an increase in overall tuber solids. For both the french fry
and chipping industries, this overall tuber solids increase will also result in higher
throughput in the processing plant due to lower processing times, and cost savings due to lower
energy input for blanching, drying, par-frying, and finish frying.

Figure 1 shows the nucleotide sequence and deduced amino acid sequence of a fructose
1,6 bisphosphate aldolase gene from E. coli (SEQ ID NO:1).

Figure 2 shows a plasmid map for plant transformation vector pMON17524.

Figure 3 shows a plasmid map for plant transformation vector pMON17542.

Figure 4 shows the change in diurnal fluctuations of sucrose, glucose, and starch levels in
tobacco leaves expressing the fda transgene (pMON17524) and control (pMON17227).
The light period is from 7:00 to 19:00 hours. Only fully expanded and non-senescing
leaves were sampled.

Figure 5 shows a plasmid map for plant transformation vector pMON13925.

Figure 6 shows a plasmid map for plant transformation vector pMON17590.

Figure 7 shows a plasmid map for plant transformation vector pMON13936.

Figure 8 shows a plasmid map for plant transformation vector pMON17581.

Figure 9 shows potato tuber cross-sections of improved solids uniformity Segal Russet
Burbank lines (top row) versus unimproved nontransgenic Russet Burbank (bottom row).

This invention is directed to a method for producing plant cells and plants
demonstrating an increased or improved growth and development, yield, quality, starch
storage uniformity, vigor, and/or stress tolerance. The method utilizes a DNA sequence
encoding an fda (fructose 1,6 bisphosphate aldolase) gene integrated in the cellular
genome of a plant as the result of genetic engineering and causes expression of the FDA
enzyme in the transgenic plant so produced. Plants that overexpress the FDA enzyme
exhibit increased carbon flow through the Calvin Cycle and increased atmospheric carbon
assimilation during early photoperiod, resulting in an increase in photosynthetic efficiency
and an increase in starch production. Thus, such plants exhibit higher levels of sucrose
production by the leaf and the ability to achieve a net increase in carbon export during a
given photoperiod. This increase in source capacity leads to increased plant growth that in
turn generates greater biomass and/or increases the size of the sink, ultimately
providing greater yields of the transgenic plant. This greater biomass or increased sink
size may be evidenced in different ways or plant parts depending on the particular plant
species or growing conditions of the plant overexpressing the FDA enzyme. Thus,
increased size resulting from overexpression of FDA may be seen in the seed, fruit, stem,
leaf, tuber, bulb or other plant part depending upon the plant species and its dominant sink
during a particular growth phase and upon the environmental effects caused by certain
growing conditions, e.g. drought, temperature or other stresses. Transgenic plants
overexpressing FDA may therefore have increased carbon assimilation, export and storage
in plant source and sink organs, which results in growth, yield, and uniformity and quality
improvements.

Plants overexpressing FDA may also exhibit desirable quality traits such as
increased production of starch, oils and/or proteins depending upon the plant species
overexpressing the FDA. Thus, overexpression of FDA in a particular plant species may
affect or alter the direction of the carbon flux, thereby directing metabolite utilization and
storage either to starch production, protein production or oil production via the role of
FDA in the glycolysis and gluconeogenesis metabolic pathways.

The mechanism whereby the expression of exogenous FDA modifies carbon
relationships is believed to derive from source-sink relationships. The leaf tissue is a
sucrose source, and if more sucrose resulting from the activity of increased FDA
expression is transported to a sink, it results in increased storage carbon (sugars, starch,
oil, protein, etc.) or nitrogen (protein, etc.) per given weight of the sink tissue.

The expression in a plant of a gene that exists in double-stranded DNA form
involves transcription of messenger RNA (mRNA) from one strand of the DNA by RNA
polymerase enzyme, and the subsequent processing of the mRNA primary transcript inside
the nucleus. This processing involves a 3' non-translated region, which directs the addition of
polyadenylate nucleotides to the 3' end of the RNA. Transcription of DNA into mRNA is
regulated by a region of DNA usually referred to as the promoter. The promoter region
contains a sequence of bases that signals RNA polymerase to associate with the DNA and
to initiate the transcription of mRNA using one of the DNA strands as a template to make
a corresponding complementary strand of RNA. This RNA is then used as a template for
the production of the protein encoded therein by the cell's protein biosynthetic machinery.

A number of promoters that are active in plant cells have been described in the
literature. These include the nopaline synthase (NOS) and octopine synthase (OCS)
promoters (which are carried on tumor-inducing plasmids of Agrobacterium tumefaciens),
the caulimovirus promoters such as the cauliflower mosaic virus (CaMV) 19S and 35S and
the figwort mosaic virus (FMV) 35S promoters, the light-inducible promoter from the
small subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO), a very abundant
plant polypeptide, and the chlorophyll a/b binding protein gene promoters, etc. All of
these promoters have been used to create various types of DNA constructs that have been
expressed in plants; see, e.g., PCT publication WO 84/02913.

Promoters that are known to or are found to cause transcription of DNA in plant
cells can be used in the present invention. Such promoters may be obtained from a variety
of sources such as plants and plant viruses and include, but are not limited to, the enhanced
CaMV35S promoter and promoters isolated from plant genes such as ssRUBISCO genes.
As described below, it is preferred that the particular promoter selected should be capable
of causing sufficient expression to result in the production of an effective amount of
fructose 1,6 bisphosphate aldolase enzyme to cause the desired increase in carbon
assimilation, export or storage. Expression of the double-stranded DNA molecules of the
present invention can be driven by a constitutive promoter, expressing the DNA molecule
in all or most of the tissues of the plant. Alternatively, it may be preferred to cause
expression of the fda gene in specific tissues of the plant, such as leaf, stem, root, tuber,
seed, fruit, etc. The promoter chosen will have the desired tissue and developmental
specificity. Those skilled in the art will recognize that the amount of fructose 1,6
bisphosphate aldolase needed to induce the desired increase in carbon assimilation, export,
or storage may vary with the type of plant. Therefore, promoter function should be
optimized by selecting a promoter with the desired tissue expression capabilities and
approximate promoter strength and selecting a transformant that produces the desired
fructose 1,6 bisphosphate aldolase activity or the desired change in metabolism of
carbohydrates in the target tissues. This selection approach from the pool of transformants
is routinely employed in expression of heterologous structural genes in plants because
there is variation between transformants containing the same heterologous gene due to the
site of gene insertion within the plant genome (commonly referred to as "position effect").
In addition to promoters that are known to cause transcription (constitutively or
tissue-specifically) of DNA in plant cells, other promoters may be identified for use in the current
invention by screening a plant cDNA library for genes that are selectively or preferably
expressed in the target tissues of interest and then isolating the promoter regions by
methods known in the art. In particular, for C4 plants such as corn, sorghum, and sugarcane,
it may be desirable to use a bundle sheath cell-specific (or cell-enhanced expression) promoter
to obtain the yield benefits of overexpression of FDA, rather than
a constitutive promoter or a promoter with mesophyll cell-enhanced expression properties.
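
The transformant-selection step described above, made necessary by position effect, amounts to ranking independent lines by measured enzyme activity against a control. The following is a minimal sketch under stated assumptions: the function name, the line identifiers, the activity values, and the 1.5-fold cutoff are all invented for illustration and do not come from this document.

```python
from typing import Dict, List


def select_transformants(
    activity_by_line: Dict[str, float],
    control_activity: float,
    min_fold_increase: float,
) -> List[str]:
    """Return lines whose activity is at least min_fold_increase times the control,
    ordered from highest to lowest activity."""
    if control_activity <= 0:
        raise ValueError("control_activity must be positive")
    threshold = control_activity * min_fold_increase
    selected = [line for line, act in activity_by_line.items() if act >= threshold]
    return sorted(selected, key=lambda line: activity_by_line[line], reverse=True)


# Invented FDA activities in arbitrary units, relative to a null-vector control of 1.0.
measurements = {"line_01": 1.1, "line_02": 2.4, "line_03": 0.9, "line_04": 1.8}

chosen = select_transformants(measurements, control_activity=1.0, min_fold_increase=1.5)
print(chosen)  # ['line_02', 'line_04']
```

In practice the ranking criterion could equally be the desired change in carbohydrate metabolism in the target tissue rather than enzyme activity itself, as the text notes.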

For the purpose of expressing the fda gene in source tissues of the plant, such as
the leaf or stem, it is preferred that the promoters utilized in the double-stranded DNA
molecules of the present invention have relatively high expression in these specific tissues.
For this purpose, one may also choose from a number of promoters for genes with
leaf-specific or leaf-enhanced expression. Examples of such genes known from the literature
are the chloroplast glutamine synthetase GS2 from pea (Edwards et al., 1990), the
chloroplast fructose-1,6-bisphosphatase (FBPase) from wheat (Lloyd et al., 1991), the
nuclear photosynthetic ST-LS1 from potato (Stockhaus et al., 1989), and the phenylalanine
ammonia-lyase (PAL) and chalcone synthase (CHS) genes from Arabidopsis thaliana
(Leyva et al., 1995). Also shown to be active in photosynthetically active tissues are the
gene for ribulose-1,5-bisphosphate carboxylase (RUBISCO), isolated from eastern larch (Larix
laricina) (Campbell et al., 1994); the cab gene, encoding the chlorophyll a/b-binding
protein of PSII, isolated from pine (cab6; Yamamoto et al., 1994), wheat (Cab-1; Fejes et
al., 1990), spinach (CAB-1; Luebberstedt et al., 1994), and rice (cab1R; Luan et al., 1992);
the gene for pyruvate orthophosphate dikinase (PPDK) from maize (Matsuoka et al., 1993); the
tobacco Lhcb1*2 gene (Cerdan et al., 1997); the Arabidopsis thaliana SUC2 sucrose-H+
symporter gene (Truernit et al., 1995); and the genes for thylakoid membrane proteins, isolated from
spinach (psaD, psaF, psaE, PC, FNR, atpC, atpD, cab, rbcS; Oelmueller et al., 1992).
Other chlorophyll a/b-binding proteins have been studied and described in the literature,
such as LhcB and PsbP from white mustard (Sinapis alba; Kretsch et al., 1995).
Homologous promoters to those described here may also be isolated from and tested in the
target or related crop plant by standard molecular biology procedures.

For the purpose of expressing the fda gene in sink tissues of the plant, for example the
tuber of the potato plant; the fruit of tomato; or the seed of maize, wheat, rice, or barley, it is
preferred that the promoters utilized in the double-stranded DNA molecules of the present
invention have relatively high expression in these specific tissues. A number of genes with
tuber-specific or tuber-enhanced expression are known, including the class I patatin
promoter (Bevan et al., 1986; Jefferson et al., 1990); the potato tuber ADPGPP genes, both
the large and small subunits (Muller et al., 1990); sucrose synthase (Salanoubat and
Belliard, 1987, 1989); the major tuber proteins including the 22 kDa protein complexes
and proteinase inhibitors (Hannapel, 1990); the granule bound starch synthase gene
(GBSS) (Rohde et al., 1990); and the other class I and II patatins (Rocha-Sosa et al., 1989;
Mignery et al., 1988). Other promoters can also be used to express a fructose 1,6
bisphosphate aldolase gene in specific tissues, such as seeds or fruits. The promoter for
β-conglycinin (Tierney, 1987) or other seed-specific promoters, such as the napin and
phaseolin promoters, can be used to over-express an fda gene specifically in seeds. The
zeins are a group of storage proteins found in maize endosperm. Genomic clones for zein
genes have been isolated (Pedersen et al., 1982), and the promoters from these clones,
including the 15 kDa, 16 kDa, 19 kDa, 22 kDa, 27 kDa, and gamma genes, could also be
used to express an fda gene in the seeds of maize and other plants. Other promoters
known to function in maize, wheat, or rice include the promoters for the following genes:
waxy, Brittle, Shrunken 2, branching enzymes I and II, starch synthases, debranching
enzymes, oleosins, glutelins, and sucrose synthases. Particularly preferred promoters for
expression of an fda gene in maize endosperm, as well as in wheat and rice, are the promoter
for a glutelin gene from rice, more particularly the Osgt-1 promoter (Zheng et al., 1993);
the promoter of the maize granule-bound starch synthase (waxy) gene (zmGBS); the rice small subunit
ADPGPP promoter (osAGP); and the zein promoters, particularly the maize 27 kDa zein
gene promoter (zm27) (see, generally, Russell et al., 1997). Examples of promoters
suitable for expression of an fda gene in wheat include those for the genes for the
ADPglucose pyrophosphorylase (ADPGPP) subunits, for the granule bound and other
starch synthases, for the branching and debranching enzymes, for the
embryogenesis-abundant proteins, for the gliadins, and for the glutenins. Examples of such promoters in
rice include those for the genes for the ADPGPP subunits, for the granule bound and other
starch synthases, for the branching enzymes, for the debranching enzymes, for sucrose
synthases, and for the glutelins. A particularly preferred promoter is the promoter for rice
glutelin, Osgt-1. Examples of such promoters for barley include those for the genes for the
ADPGPP subunits, for the granule bound and other starch synthases, for the branching
enzymes, for the debranching enzymes, for sucrose synthases, for the hordeins, for the
embryo globulins, and for the aleurone-specific proteins.

The solids content of root tissue may be increased by expressing an fda gene
behind a root-specific promoter. An example of such a promoter is the promoter from the
acid chitinase gene (Samac et al., 1990). Expression in root tissue could also be
accomplished by utilizing the root-specific subdomains of the CaMV35S promoter that
have been identified (Benfey et al., 1989).

The RNA produced by a DNA construct of the present invention may also contain
a 5' non-translated leader sequence. This sequence can be derived from the promoter
selected to express the gene and can be specifically modified so as to increase translation
of the mRNA. The 5' non-translated regions can also be obtained from viral RNAs, from
suitable eukaryotic genes, or from a synthetic gene sequence. The present invention is not
limited to constructs, as presented in the following examples, wherein the non-translated
region is derived from the 5' non-translated sequence that accompanies the promoter
sequence. Rather, the non-translated leader sequence can be derived from an unrelated
promoter or coding sequence.

In monocots, an intron is preferably included in the gene construct to facilitate or
enhance expression of the coding sequence. Examples of suitable introns include the
HSP70 intron and the rice actin intron, both of which are known in the art. Another
suitable intron is the castor bean catalase intron (Suzuki et al., 1994).

Polyadenylation signal

The 3' non-translated region of the chimeric plant gene contains a polyadenylation
signal that functions in plants to cause the addition of polyadenylate nucleotides to the
3' end of the RNA. Examples of suitable 3' regions are (1) the 3' transcribed,
non-translated regions containing the polyadenylation signal of Agrobacterium tumor-inducing
(Ti) plasmid genes, such as the nopaline synthase (NOS) gene, and (2) plant genes like the
soybean storage protein genes and the small subunit of the ribulose-1,5-bisphosphate
carboxylase (ssRUBISCO) gene.

Plastid-directed expression of fructose-1,6-bisphosphate aldolase activity

In one embodiment of the invention, the fda gene may be fused to a chloroplast
transit peptide, in order to target the FDA protein to the plastid. As used hereinafter,
chloroplast and plastid are intended to include the various forms of plastids including
amyloplasts. Many plastid-localized proteins are expressed from nuclear genes as
precursors and are targeted to the plastid by a chloroplast transit peptide (CTP), which is
removed during the import steps. Examples of such chloroplast proteins include the small
subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO, SSU),
5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), ferredoxin, ferredoxin
oxidoreductase, the light-harvesting-complex protein I and protein II, and thioredoxin F. It
has been demonstrated that non-plastid proteins may be targeted to the chloroplast by use
of protein fusions with a CTP and that a CTP sequence is sufficient to target a protein to
the plastid. Those skilled in the art will also recognize that various other chimeric
constructs can be made that utilize the functionality of a particular plastid transit peptide to
import the fructose-1,6-bisphosphate aldolase enzyme into the plant cell plastid. The fda
gene could also be targeted to the plastid by transformation of the gene into the chloroplast
genome (Daniell et al., 1998).

Fructose 1,6 bisphosphate aldolases

As used herein, the term "fructose 1,6-bisphosphate aldolase" means an enzyme
(E.C. 4.1.2.13) that catalyzes the reversible cleavage of fructose 1,6-bisphosphate to form
glyceraldehyde 3-phosphate (G3P) and dihydroxyacetone phosphate (DHAP). Aldolase
enzymes are divided into two classes, designated class I and class II (Witke and Gotz,
1993). Various fda genes encoding the enzyme have been sequenced, as have numerous
proteins, such as the cytosolic enzyme from maize (GenBank Accession S07789; S10638),
cytosolic enzyme from rice (GenBank Accession JQ0543), cytosolic enzyme from spinach
(GenBank Accession S31091; S22093), from Arabidopsis thaliana (GenBank Accession
S11958), from spinach chloroplast (GenBank Accession S31090; A21815; S22092), from
yeast (S. cerevisiae) (GenBank Accession S07855; S37882; S12945; S39178;
S44523; X75781), from Rhodobacter sphaeroides (GenBank Accession B40767; D41080),
from B. subtilis (GenBank Accession S55426; D32354; E32354; D41835), from garden
pea (GenBank Accession S29048; S34411), from garden pea chloroplast (GenBank
Accession S29047; S34410), from maize (GenBank Accession S05019), from
Chlamydomonas reinhardtii (GenBank Accession S48639; S58485; S58486; S34367),
from Corynebacterium glutamicum (GenBank Accession S09283; X17313), from
Campylobacter jejuni (GenBank Accession S52413), from Haemophilus influenzae (strain
Rd KW20) (GenBank Accession C64074), from Streptococcus pneumoniae (GenBank
Accession AJ005697), from rice (GenBank Accession X53130), and from the maize
anaerobically regulated gene (GenBank Accession X12872).

The class I enzymes may be isolated from higher eukaryotes, such as animals and
plants, and from some prokaryotes, including Peptococcus aerogenes (Lebherz and Rutter,
1973), Lactobacillus casei (London and Kline, 1973), Escherichia coli (Stribling and
Perham, 1973), Mycobacterium smegmatis (Bai et al., 1975), and most staphylococcal
species (Gotz et al., 1979). The gene for the FDA enzyme may be obtained by known
methods and has already been obtained from several organisms, such as rabbit (Lai et al.,
1974), human (Besmond et al., 1983), rat (Tsutsumi et al., 1984), Trypanosoma brucei
(Clayton, 1985), and Arabidopsis thaliana (Chopra et al., 1990). These class I enzymes
are invariably tetrameric proteins with a total molecular weight of about 160 kDa and
function by imine formation between the substrate and a lysine residue in the active site
(Alefounder et al., 1989).

In animals, three class I isozymes, classified as A, B, and C, are expressed in the
cytosol of muscle, liver, and brain tissue, respectively, and they differ from plant aldolases
in their expression and compartmentation patterns (Joh et al., 1986). In the leaves of
higher plants, FDA is a class I enzyme, and two different isoenzymes within the class have
been documented. One is contained in the chloroplast and the other in the cytosol
(Lebherz et al., 1984). The acidic plant isozyme appears to be chloroplastic and comprises
about 85% of the total leaf aldolase activity. The basic plant isozyme is cytosolic. Both
isozymes appear to be encoded by the nuclear genome, by different genes
(Lebherz et al., 1984).

The class II type aldolases are normally dimeric, with a molecular mass of
approximately 80 kDa, and their activity depends on divalent metal ions. The class II
enzymes may be isolated from prokaryotes, such as blue-green algae and bacteria, and from
eukaryotic green algae and fungi (Baldwin et al., 1978). The gene for the FDA class II
enzyme may be obtained by known methods and has already been obtained from several
organisms including Saccharomyces cerevisiae (Jack and Harris, 1971), Bacillus
stearothermophilus (Jack, 1973), and Escherichia coli (Baldwin et al., 1978).

It is believed that highly homologous class II fructose 1,6-bisphosphate aldolases
with similar catalytic activity will also be found in other species of microorganism, such
as Saccharomyces (Saccharomyces cerevisiae); Bacillus (Bacillus subtilis); Rhodobacter
(Rhodobacter sphaeroides); Plasmodium (Plasmodium falciparum, Plasmodium berghei);
Trypanosoma (Trypanosoma brucei); Chlamydomonas (Chlamydomonas reinhardtii);
Candida (Candida albicans); Corynebacterium (Corynebacterium glutamicum);
Campylobacter (Campylobacter jejuni); and Haemophilus (Haemophilus influenzae).

Such sequences can be readily isolated by methods well known in the art, for
example by nucleic acid hybridization. The hybridization properties of a given pair of
nucleic acids are an indication of their similarity or identity. Nucleic acid sequences can
be selected on the basis of their ability to hybridize with known fda sequences. Low
stringency conditions may be used to select sequences with less homology or identity.
One may wish to employ conditions such as about 0.15 M to about 0.9 M sodium chloride,
at temperatures ranging from about 20°C to about 55°C. High stringency conditions may
be used to select for nucleic acid sequences with higher degrees of identity to the disclosed
sequences. Conditions typically employed may include about 0.02 M to about 0.15 M
sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS or about 0.1%
N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at hybridization
temperatures between about 50°C and about 70°C. More preferably, high stringency
conditions are about 0.02 M sodium chloride, about 0.5% casein, about 0.02% SDS, about
0.001 M sodium citrate, at a temperature of about 50°C. The skilled individual will
recognize that numerous variations are possible in the conditions and means by which
nucleic acid hybridization can be performed to isolate fda sequences having similarity to
fda sequences known in the art, and that these are not limited to those explicitly disclosed
herein. Preferably, such an approach is used to isolate fda sequences having greater than
about 60% identity with the disclosed E. coli fda sequence, more preferably greater than
about 70% identity, most preferably greater than about 80% identity.

Depending on growth conditions, Euglena gracilis, Chlamydomonas mundana, and
Chlamydomonas reinhardtii produce either a class I or a class II aldolase (Cremona, 1968;
Russell and Gibbs, 1967; Guerrini et al., 1971).

The isolation of a class II fda gene from E. coli is described in the following
examples. Its DNA sequence is given as SEQ ID NO:1 and shown in Figure 1. The
amino acid sequence is given as SEQ ID NO:2 and shown in Figure 1. This gene can be
used as isolated by inserting it into plant expression vectors suitable for the transformation
method of choice as described. The E. coli FDA enzyme has an apparent pH optimum
range near pH 7-9 and retains activity in the lower pH range of 5-7 (Baldwin et al., 1978;
Alefounder et al., 1989).

Thus, many different genes that encode a fructose 1,6 bisphosphate aldolase
activity may be isolated and used in the present invention.

Synthetic gene construction

A carbohydrate metabolizing enzyme considered in this invention includes any
sequence of amino acids, such as protein, polypeptide, or peptide fragment, that
demonstrates the ability to catalyze a reaction involved in the synthesis or degradation of
starch or sucrose. These can be sequences obtained from a heterologous source, such as
algae, bacteria, fungi, and protozoa, or endogenous plant sequences, by which is meant
any sequence that can be naturally found in a plant cell, including native (indigenous)
plant sequences as well as sequences from plant viruses or plant pathogenic bacteria.

It will be recognized by one of ordinary skill in the art that carbohydrate
metabolizing enzyme gene sequences may also be modified using standard techniques
such as site-specific mutation or PCR, or modification of the sequence may be
accomplished by producing a synthetic nucleic acid sequence and will still be considered a
carbohydrate biosynthesis enzyme nucleic acid sequence of this invention. For example,
"wobble" positions in codons may be changed such that the nucleic acid sequence encodes
the same amino acid sequence, or alternatively, codons can be altered such that
conservative amino acid substitutions result. In either case, the peptide or protein
maintains the desired enzymatic activity and is thus considered part of this invention.

A nucleic acid sequence to a carbohydrate metabolizing enzyme may be a DNA or
RNA sequence, derived from genomic DNA, cDNA, mRNA, or may be synthesized in
whole or in part. The structural gene sequences may be cloned, for example, by isolating
genomic DNA from an appropriate source and amplifying and cloning the sequence of
interest using a polymerase chain reaction (PCR). Alternatively, the gene sequences may
be synthesized, either completely or in part, especially where it is desirable to provide
plant-preferred sequences. Thus, all or a portion of the desired structural gene may be
synthesized using codons preferred by a selected plant host. Plant-preferred codons may
be determined, for example, from the codons used most frequently in the proteins
expressed in a particular plant host species. Other modifications of the gene sequences
may result in mutants having slightly altered activity.
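
The synonymous ("wobble") recoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the preferred-codon table below is hypothetical and is not taken from any plant host or from WO 90/10076, and the short input sequence is an arbitrary example rather than the fda coding region. The sketch replaces each codon with a preferred synonym where one is listed and leaves the encoded amino acid sequence unchanged.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


# HYPOTHETICAL preference table, for illustration only (not real usage data).
HYPOTHETICAL_PREFERRED = {
    "A": "GCC", "K": "AAG", "I": "ATC", "F": "TTC", "D": "GAC", "V": "GTG",
}

example = "ATGGCTAAGATTTTTGATTTCGTA"  # arbitrary illustrative fragment
recoded = recode(example, HYPOTHETICAL_PREFERRED)

# The nucleotide sequence changes, the protein sequence does not.
assert translate(recoded) == translate(example)
```

In practice the preference table would be derived from codon frequencies in genes expressed in the chosen host, as the text indicates; amino acids absent from the table are simply left with their original codons.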

If desired, the gene sequence of the fda gene can be changed without changing the
protein sequence in such a manner as may increase expression and thus even more
positively affect carbohydrate content in transformed plants. A preferred manner for
making the changes in the gene sequence is set out in PCT Publication WO 90/10076. A
gene synthesized by following the methodology set out therein may be introduced into
plants as described below and result in higher levels of expression of the FDA enzyme.
This may be particularly useful in monocots such as maize, rice, wheat, sugarcane, and
barley.

Combinations with other transgenes

The effect of fda in transgenic plants may be enhanced by combining it with other
genes that positively affect carbohydrate assimilation or content, such as a gene encoding
a sucrose phosphorylase as described in PCT Publication WO 96/24679, or ADPGPP
genes such as the E. coli glgC gene and its mutant glgC16. PCT Publication WO
91/19806 discloses how to incorporate the latter gene into many plant species in order to
increase starch or solids. Another gene that can be combined with fda to increase carbon
assimilation, export or storage is a gene encoding sucrose phosphate synthase (SPS).
PCT Publication WO 92/16631 discloses one such gene and its use in transgenic plants.

Plant transformation/regeneration

In developing the nucleic acid constructs of this invention, the various components
of the construct or fragments thereof will normally be inserted into a convenient cloning
vector, e.g., a plasmid that is capable of replication in a bacterial host, e.g., E. coli.
Numerous vectors exist that have been described in the literature, many of which are
commercially available. After each cloning, the cloning vector with the desired insert may
be isolated and subjected to further manipulation, such as restriction digestion, insertion of
new fragments or nucleotides, ligation, deletion, mutation, resection, etc. so as to tailor the
components of the desired sequence. Once the construct has been completed, it may then
be transferred to an appropriate vector for further manipulation in accordance with the
manner of transformation of the host cell.

A recombinant DNA molecule of the invention typically includes a selectable
marker so that transformed cells can be easily identified and selected from
non-transformed cells. Examples of such include, but are not limited to, a neomycin
phosphotransferase (nptII) gene (Potrykus et al., 1985), which confers kanamycin
resistance. Cells expressing the nptII gene can be selected using an appropriate antibiotic
such as kanamycin or G418. Other commonly used selectable markers include the bar
gene, which confers bialaphos resistance; a mutant EPSP synthase gene (Hinchee et al.,
1988), which confers glyphosate resistance; a nitrilase gene, which confers resistance to
bromoxynil (Stalker et al., 1988); a mutant acetolactate synthase gene (ALS), which
confers imidazolinone or sulphonylurea resistance (European Patent Application 154,204,
1985); and a methotrexate resistant DHFR gene (Thillet et al., 1988).

Plants that can be made to have enhanced carbon assimilation, increased carbon
export and partitioning by practice of the present invention include, but are not limited to,
Acacia, alfalfa, aneth, apple, apricot, artichoke, arugula, asparagus, avocado, banana,
barley, beans, beet, blackberry, blueberry, broccoli, brussels sprouts, cabbage, canola,
cantaloupe, carrot, cassava, cauliflower, celery, cherry, cilantro, citrus, Clementines,
coffee, corn, cotton, cucumber, Douglas fir, eggplant, endive, escarole, eucalyptus, fennel,
figs, gourd, grape, grapefruit, honey dew, jicama, kiwifruit, lettuce, leeks, lemon, lime,
Loblolly pine, mango, melon, mushroom, nut, oat, oil seed rape, okra, onion, orange, an
ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, persimmon, pine,
pineapple, plantain, plum, pomegranate, poplar, potato, pumpkin, quince, radiata pine,
radicchio, radish, raspberry, rice, rye, sorghum, Southern pine, soybean, spinach, squash,
strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum, tangerine, tea,
tobacco, tomato, triticale, turf, a vine, watermelon, wheat, yams, and zucchini.

A double-stranded DNA molecule of the present invention containing an fda gene
can be inserted into the genome of a plant by any suitable method. Suitable plant
transformation vectors include those derived from a Ti plasmid of Agrobacterium
tumefaciens, as well as those disclosed, e.g., by Herrera-Estrella et al. (1983), Bevan
(1984), Klee et al. (1985) and EPO publication 120,516. In addition to plant
transformation vectors derived from the Ti or root-inducing (Ri) plasmids of
Agrobacterium, alternative methods can be used to insert the DNA constructs of this
invention into plant cells. Such methods may involve, for example, the use of liposomes,
electroporation, chemicals that increase free DNA uptake, free DNA delivery via
microprojectile bombardment, and transformation using viruses or pollen. DNA may also
be inserted into the chloroplast genome (Daniell et al., 1998).

A plasmid expression vector suitable for the introduction of an fda gene in
monocots using microprojectile bombardment is composed of the following: a promoter
that is specific or enhanced for expression in the starch storage tissues in monocots,
generally the endosperm, such as promoters for the zein genes found in the maize
endosperm (Pedersen et al., 1982); an intron that provides a splice site to facilitate
expression of the gene, such as the Hsp70 intron (PCT Publication WO 93/19189); and a 3'
polyadenylation sequence such as the nopaline synthase 3' sequence (NOS 3'; Fraley et al.,
1983). This expression cassette may be assembled on high copy replicons suitable for the
production of large quantities of DNA.

A particularly useful Agrobacterium-based plant transformation vector for use in
transformation of dicotyledonous plants is plasmid vector pMON530 (Rogers et al., 1987).
Plasmid pMON530 is a derivative of pMON505 prepared by transferring the 2.3 kb
StuI-HindIII fragment of pMON316 (Rogers et al., 1987) into pMON526. Plasmid pMON526
is a simple derivative of pMON505 in which the SmaI site is removed by digestion with
XmaI, treatment with Klenow polymerase and ligation. Plasmid pMON530 retains all the
properties of pMON505 and the CaMV35S-NOS expression cassette and now contains a
unique cleavage site for SmaI between the promoter and polyadenylation signal.

Binary vector pMON505 is a derivative of pMON200 (Rogers et al., 1987) in
which the Ti plasmid homology region, LIH, has been replaced with a 3.8 kb HindIII to
SmaI segment of the mini RK2 plasmid, pTJS75 (Schmidhauser and Helinski, 1985). This
segment contains the RK2 origin of replication, oriV, and the origin of transfer, oriT, for
conjugation into Agrobacterium using the tri-parental mating procedure (Horsch and Klee,
1986). Plasmid pMON505 retains all the important features of pMON200 including the
synthetic multi-linker for insertion of desired DNA fragments, the chimeric
NOS/NPTII'/NOS gene for kanamycin resistance in plant cells, the
spectinomycin/streptomycin resistance determinant for selection in E. coli and
A. tumefaciens, an intact nopaline synthase gene for facile scoring of transformants and
inheritance in progeny, and a pBR322 origin of replication for ease in making large
amounts of the vector in E. coli. Plasmid pMON505 contains a single T-DNA border
derived from the right end of the pTiT37 nopaline-type T-DNA. Southern blot analyses
have shown that plasmid pMON505 and any DNA that it carries are integrated into the
plant genome, that is, the entire plasmid is the T-DNA that is inserted into the plant
genome. One end of the integrated DNA is located between the right border sequence and
the nopaline synthase gene and the other end is between the border sequence and the
pBR322 sequences.

Another particularly useful Ti plasmid cassette vector is pMON17227. This vector
is described in PCT Publication WO 92/04449 and contains a gene encoding an enzyme
conferring glyphosate resistance (denominated CP4), which is an excellent selection
marker gene for many plants, including potato and tomato. The gene is fused to the
Arabidopsis EPSPS chloroplast transit peptide (CTP2) and expressed from the FMV
promoter as described therein.

When adequate numbers of cells (or protoplasts) containing the fda gene or cDNA
are obtained, the cells (or protoplasts) are regenerated into whole plants. Choice of
methodology for the regeneration step is not critical, with suitable protocols being
available for hosts from Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot,
celery, parsnip), Cruciferae (cabbage, radish, canola/rapeseed, etc.), Cucurbitaceae
(melons and cucumber), Gramineae (wheat, barley, rice, maize, etc.), Solanaceae (potato,
tobacco, tomato, peppers), various floral crops, such as sunflower, and nut-bearing trees,
such as almonds, cashews, walnuts, and pecans. See, e.g., Ammirato et al. (1984);
Shimamoto et al. (1989); Fromm (1990); Vasil et al. (1990); Vasil et al. (1992);
Hayashimoto (1990); and Datta et al. (1990).

The following definitions are provided in order to aid those skilled in the art in
understanding the detailed description of the present invention.

The term "promoter" or "promoter region" refers to a nucleic acid sequence,
usually found upstream (5') to a coding sequence, that controls expression of the coding
sequence by controlling production of messenger RNA (mRNA) by providing the
recognition site for RNA polymerase or other factors necessary for start of transcription at
the correct site. As contemplated herein, a promoter or promoter region includes
variations of promoters derived by means of ligation to various regulatory sequences,
random or controlled mutagenesis, and addition or duplication of enhancer sequences.
The promoter region disclosed herein, and biologically functional equivalents thereof, are
responsible for driving the transcription of coding sequences under their control when
introduced into a host as part of a suitable recombinant vector, as demonstrated by their
ability to produce mRNA.

"Regeneration" refers to the process of growing a plant from a plant cell (e.g., plant 
protoplast or explant). 

"Transformation" refers to a process of introducing an exogenous nucleic acid
sequence (e.g., a vector, recombinant nucleic acid molecule) into a cell or protoplast in
which that exogenous nucleic acid is incorporated into a chromosome or is capable of
autonomous replication.

A "transformed cell" is a cell whose DNA has been altered by the introduction of
an exogenous nucleic acid molecule into that cell.

The term "gene" refers to chromosomal DNA, plasmid DNA, cDNA, synthetic
DNA, or other DNA that encodes a peptide, polypeptide, protein, or RNA molecule, and
regions flanking the coding sequence involved in the regulation of expression.

"Identity" refers to the degree of similarity between two nucleic acid or protein 
sequences. An alignment of the two sequences is performed by a suitable computer
program. A widely used and accepted computer program for performing sequence
alignments is CLUSTAL W v1.6 (Thompson et al., 1994). The number of matching bases
or amino acids is divided by the total number of bases or amino acids and multiplied by
100 to obtain a percent identity. For example, if two 580 base pair sequences had 145
matched bases, they would be 25 percent identical. If the two compared sequences are of
different lengths, the number of matches is divided by the shorter of the two lengths. For
example, if there were 100 matched amino acids between a 200 and a 400 amino acid
protein, they are 50 percent identical with respect to the shorter sequence. If the shorter
sequence is less than 50 bases or amino acids in length, the number of matches is divided
by 50 and multiplied by 100 to obtain a percent identity.
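
The percent identity rule defined above reduces to a short calculation. The following Python sketch assumes the alignment has already been performed (for example with CLUSTAL W) and that only the match count and the two sequence lengths are supplied; the function name and its argument names are illustrative and do not appear in the specification.

```python
def percent_identity(matches: int, len_a: int, len_b: int) -> float:
    """Percent identity as defined in the text.

    The match count is divided by the length of the shorter sequence
    (equal to the total length when both sequences are the same length),
    or by 50 when the shorter sequence is under 50 residues long.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("sequence lengths must be positive")
    shorter = min(len_a, len_b)
    if not 0 <= matches <= shorter:
        raise ValueError("matches must lie between 0 and the shorter length")
    denominator = max(shorter, 50)  # floor of 50 for very short sequences
    return 100.0 * matches / denominator


# The two worked examples given in the text:
print(percent_identity(145, 580, 580))  # two 580 bp sequences, 145 matches
print(percent_identity(100, 200, 400))  # 200 vs. 400 residue proteins
```

The floor of 50 in the denominator keeps very short sequences from scoring a high percent identity on a handful of matches.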

"C-terminal region" refers to the region of a peptide, polypeptide, or protein chain
from the middle thereof to the end that carries the amino acid having a free carboxyl
group.

The phrase "DNA segment heterologous to the promoter region" means that the
coding DNA segment does not exist in nature in the same gene with the promoter to which
it is now attached.

The term "encoding DNA" refers to chromosomal DNA, plasmid DNA, cDNA, or
synthetic DNA that encodes any of the enzymes discussed herein.

The term "genome" as it applies to bacteria encompasses both the chromosome and
plasmids within a bacterial host cell. Encoding DNAs of the present invention introduced
into bacterial host cells can therefore be either chromosomally integrated or
plasmid-localized. The term "genome" as it applies to plant cells encompasses not only
chromosomal DNA found within the nucleus, but organelle DNA found within subcellular
components of the cell. DNAs of the present invention introduced into plant cells can
therefore be either chromosomally integrated or organelle-localized.

The terms "microbe" or "microorganism" refer to algae, bacteria, fungi, and
protozoa.

The term "mutein" refers to a mutant form of a peptide, polypeptide, or protein.

"N-terminal region" refers to the region of a peptide, polypeptide, or protein chain
from the amino acid having a free amino group to the middle of the chain.

"Overexpression" refers to the expression of a polypeptide or protein encoded by a
DNA introduced into a host cell, wherein said polypeptide or protein is either not normally
present in the host cell, or wherein said polypeptide or protein is present in said host cell at
a higher level than that normally expressed from the endogenous gene encoding said
polypeptide or protein.

The term "plastid" refers to the class of plant cell organelles that includes
amyloplasts, chloroplasts, chromoplasts, elaioplasts, eoplasts, etioplasts, leucoplasts, and
proplastids. These organelles are self-replicating and contain what is commonly referred
to as the "chloroplast genome," a circular DNA molecule that ranges in size from about
120 kb to about 217 kb, depending upon the plant species, and which usually contains an
inverted repeat region.

The phrase "simple carbohydrate substrate" means a monosaccharide or an
oligosaccharide but not a polysaccharide; simple carbohydrate substrate includes glucose,
fructose, sucrose, and lactose. More complex carbohydrate substrates commonly used in
media such as corn syrup, starch, and molasses can be broken down to simple
carbohydrate substrates.

The term "solids" refers to the nonaqueous component of a tuber (such as in
potato) or a fruit (such as in tomato) comprised mostly of starch and other
polysaccharides, simple carbohydrates, nonstructural carbohydrates, amino acids, and
other organic molecules.

The following examples are provided to better elucidate the practice of the present
invention and should not be interpreted in any way to limit the scope of the present
invention. Those skilled in the art will recognize that various modifications, truncations,
etc., can be made to the methods and genes described herein while not departing from the
spirit and scope of the present invention.

EXAMPLES

EXAMPLE 1

cDNA cloning and overexpression

Unless otherwise stated, basic DNA manipulations and genetic techniques, such as
PCR, agarose electrophoresis, restriction digests, ligations, E. coli transformations, colony
screens, and Western blots were performed essentially by the protocols described in
Sambrook et al. (1989) or Maniatis et al. (1982).

The E. coli fda gene sequence (SEQ ID NO:1) was obtained from GenBank
(Accession Number X14682) and nucleotide primers with homology to the 5' and 3' ends
were designed for PCR amplification. E. coli chromosomal DNA was extracted and the
E. coli fda gene was amplified by PCR using the 5' oligonucleotide
5'GGGGCCATGGCTAAGATTTTTGATTTCGTA3' (SEQ ID NO:3) and the 3'
oligonucleotide 5'CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG3' (SEQ ID
NO:4). The PCR cycling conditions were as follows: 94°C, 5 min (1 cycle); addition of
polymerase; 94°C, 1 min; 60°C, 1 min; 72°C, 2 min 30 sec (35 cycles). The 1.08 kb PCR
product was gel purified and ligated into an E. coli expression vector, pMON5723, to form
a vector construct that was used for transformation of frozen competent E. coli JM101
cells. The pMON5723 vector contains the E. coli recA promoter and the T7 gene 10 leader
(G10L) sequences, which enable high level expression in E. coli (Wong et al., 1988). After
induction of the transformed cells, a distinct protein band of about 40 kDa was apparent on
an SDS PAGE gel, which correlates with the size of the subunit polypeptide chain of the
dimeric aldolase II. It was shown that most of the induced protein was present in the
soluble phase. A gel slice containing the highly induced protein was isolated and
antibodies were produced in a goat, which was injected with the homogenized gel slice
(emulsified in Freund's complete adjuvant).
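
The two PCR primers given above (SEQ ID NO:3 and SEQ ID NO:4) can be inspected with a short Python sketch. The specification does not name the restriction enzymes involved, so reading the CCATGG and GAGCTC motifs as NcoI and SacI cloning sites is an assumption made here for illustration; the recognition sequences themselves are the standard ones for those enzymes. The sketch locates each motif and reverse-complements the 3' primer to show the coding-strand end of the amplified gene.

```python
FWD_PRIMER = "GGGGCCATGGCTAAGATTTTTGATTTCGTA"      # SEQ ID NO:3
REV_PRIMER = "CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG"  # SEQ ID NO:4

# Standard recognition sequences; their use as cloning sites is an assumption.
SITES = {"NcoI": "CCATGG", "SacI": "GAGCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict:
    """Map each enzyme name to the 0-based index of its first site in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print(find_sites(FWD_PRIMER))  # site embedding the ATG start codon
print(find_sites(REV_PRIMER))  # site just downstream of the stop codon

# Coding-strand view of the 3' primer: the gene's last codons, TAA stop, site.
coding_end = reverse_complement(REV_PRIMER)
print(coding_end)
```

On this reading, the 5' primer places the ATG start codon inside the CCATGG motif, and the reverse complement of the 3' primer ends in a TAA stop codon immediately followed by GAGCTC.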

The fda gene sequence was subsequently cloned into another E. coli expression
vector, under the control of the tac promoter. Induction with IPTG of JM101 cells
transformed with this vector showed the same 40 kDa overexpressed protein band. This
new clone was used in an enzyme assay for FDA activity. Cells transformed with this
vector construct were grown in a liquid culture, induced with IPTG, and grown for another
3 hours. Subsequently, a 3 mL cell culture was spun down, dissolved in 100 mM Tris and
sonicated. The cell pellet was spun down, and the crude cell extract supernatant was
assayed for FDA activity, using a coupled enzymatic assay as described by Baldwin et al.
(1978). This assay was routinely performed at 30°C.

The reaction was performed in a 1 mL final volume in the presence of an excess of the
enzymes triosephosphate isomerase (TIM) and alpha-glycerophosphate dehydrogenase
(GDH) in a reaction mixture containing final concentrations of 100 mM Tris pH 8.0, 4.75
mM fructose 1,6 bisphosphate, 0.15 mM NADH, 500 U/mL TIM, and 30 U/mL GDH.

The decrease in absorbance at 340 nm, after addition of the cell extract supernatant,
was recorded using an HP diode array spectrophotometer. One international unit (I.U.) of
aldolase activity is that causing the oxidation of 2 µmol of NADH/min in this assay
system.

Cell extracts containing the vector with the fda sequence showed a substantial
increase in aldolase activity (13.1 I.U./mg protein) as compared to cells transformed with
the control vector (0.15 I.U./mg protein). The activity was shown to be inhibited by
EDTA, known to specifically inhibit class II aldolases.
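
The unit definition above can be turned into a small calculation. The Python sketch below is a minimal illustration, not the procedure of Baldwin et al. (1978): it assumes a 1 cm light path and the commonly used millimolar extinction coefficient of NADH at 340 nm (6.22 per mM per cm), neither of which is stated in the text, and the function and parameter names are hypothetical. The factor of two follows the definition given here, consistent with each cleaved fructose 1,6-bisphosphate yielding two triose phosphates in the coupled assay.

```python
NADH_EPSILON_MM = 6.22  # mM^-1 cm^-1 at 340 nm (assumed, not from the text)


def aldolase_units(delta_a340_per_min: float,
                   reaction_volume_ml: float = 1.0,
                   path_length_cm: float = 1.0) -> float:
    """Aldolase activity in I.U. from the rate of A340 decrease.

    One I.U. is taken, as in the text, as the activity that oxidizes
    2 micromoles of NADH per minute.
    """
    # Beer-Lambert: rate of concentration change in mM/min (= umol/mL/min).
    nadh_mm_per_min = delta_a340_per_min / (NADH_EPSILON_MM * path_length_cm)
    nadh_umol_per_min = nadh_mm_per_min * reaction_volume_ml
    return nadh_umol_per_min / 2.0


def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in I.U. per mg of protein added to the assay."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return units / protein_mg


# Hypothetical reading: A340 falls by 0.622 per minute in the 1 mL assay.
units = aldolase_units(0.622)
print(units, specific_activity(units, protein_mg=0.01))
```

The input values in the usage lines are invented for illustration and are not measurements from the examples.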

EXAMPLE 2

Plant transformation and fda expression in tobacco

Targeting of FDA protein

E. coli fructose 1,6 bisphosphate aldolase was targeted to the plastid in plants in
order to assess its influence on carbohydrate metabolism and starch biosynthesis in these
plant organelles. To accomplish the import of the E. coli aldolase into the plastids, a vector
was constructed in which the aldolase was fused to the Arabidopsis small subunit transit
peptide (CTP1) (Stark et al., 1992) or the maize small subunit CTP (Russell et al., 1993),
creating constructs in which the CTP-fda fusion gene was located between the 35S
promoter from the figwort mosaic virus (P-FMV35S; Gowda et al., 1989) and the
3'-nontranslated region of the nopaline synthase gene (NOS 3'; Fraley et al., 1983)
sequences. The vector construct containing the expression cassette
[P-FMV/CTP1/fda/NOS3'] was subsequently used for tobacco protoplast transformation,
which was performed as described in Fromm et al. (1987), with the following
modifications. Tobacco cultivar Xanthi line D (Txd) cell suspensions were grown in
250-mL flasks, at 25°C and 138 rpm in the dark. For maintenance, a sub-culture volume of 9
mL was removed and added to 40 mL of fresh Txd media containing MS salts, 3%
sucrose, 0.2 g/L inositol, 0.13 g/L asparagine, 80 µL of a 50 mg/mL stock of PCPA, 5 µL
of a 1 mg/mL stock of kinetin, and 1 mL of 1000x vitamins (1.3 g/L nicotinic acid, 0.25
g/L thiamine, 0.25 g/L pyridoxine HCl, and 0.25 g/L calcium pantothenate) every 3 to 4
days. Protoplasts were isolated from 1-day-old suspension cells that came from a
2-day-old culture. Sixteen milliliters of cells were added to 40 mL of fresh Txd media and
allowed to grow 24 hours prior to digestion and isolation of the protoplasts. The
centrifugation stage for the enzyme mix was eliminated. The electroporation buffer
and protoplast isolation media were filter sterilized rather than autoclaved. The
electroporation buffer did not have 4 mM CaCl2 added. The suspension cells were
digested in enzyme for 1 hour. Protoplasts were counted on a hemacytometer, counting
only the protoplasts that looked intact and circular. Bio-Rad Gene Pulser cuvettes (catalog #
165-2088) with a 0.4-cm gap and a maximum volume of 0.8 mL were used for the
electroporations. Fifty to 100 µg of DNA containing the gene of interest along with 5 µg
of internal control DNA containing the luciferase gene were added per cuvette. The final
protoplast density at electroporation was 2 x 10^6/mL and electroporator settings were a 500
µF capacitance and 140 volts on the Bio-Rad Gene Pulser. Protoplasts were put on ice
after resuspension in electroporation buffer and remained on ice in cuvettes until 10
minutes after electroporation. Protoplasts were added to 7 mL of Txd media + 0.4 M
mannitol and conditioning media after electroporation. At this stage coconut water was no
longer used. The protoplasts were grown in 1- hour day/night photoperiod regime at 26°C
and were spun down and assayed or frozen 20-24 hours after electroporation.

Western blot analysis performed on the protoplast extracts, obtained after 
transformation, showed processing into the mature FDA in the tobacco protoplasts. 
Expression was detected of a protein migrating at approximately 40 kDa, which is the 
molecular weight of the aldolase subunit and the size of the protein also observed after 

25 overexpression of the aldolase in E. coli. 

The expression cassette [P-FMV/CTP1/fda/NOS3'] was subsequently cloned into
the NotI site of pMON17227 (described in PCT Publication WO 92/04449), in the same
orientation as the selectable marker expression cassette, to form the plant transformation
vector pMON17524, as shown in Figure 2 (SEQ ID NO:5).

An additional construct was made and used for tobacco protoplast transformation,
fusing the fda gene to the Arabidopsis EPSPS transit peptide (CTP2), which is described
in US patent 5,463,175. The transit peptide was cloned (through the SphI site) into the
SphI site located immediately upstream from the N-terminus of the fda gene sequence in
the CTP1-fda fusion (described above). This new CTP2-fda fusion gene was then cloned
into a vector between the FMV promoter and the NOS 3' sequences. When this construct
containing the CTP2-fda gene sequences was used for tobacco protoplast transformation,
expression was detected of a protein migrating at approximately 40 kDa, which is the
molecular weight of the aldolase subunit and the size of the protein also observed after
overexpression of the aldolase in E. coli.

The NotI cassette [P-FMV/CTP2/fda/NOS3'] from this construct was then cloned
into the NotI site of pMON17227, in the same orientation as the selectable marker
expression cassette, to form the plant transformation vector pMON17542, which is shown
in Figure 3 (SEQ ID NO:6).

For cytoplasmic expression of the FDA in tobacco protoplasts, a construct was
made in which the fda gene sequence (without being coupled to a transit peptide) was
cloned into a vector backbone, between the FMV promoter and the NOS 3' sequences.
Using this construct for tobacco protoplast transformation also showed expression of a
protein of the same size, migrating at approximately 40 kDa.
fda expression in tobacco plants 

Two constructs containing the fda gene, fused either to the Arabidopsis small subunit
transit peptide CTP1 (pMON17524; SEQ ID NO:5, Figure 2) or to the Arabidopsis EPSPS
transit peptide CTP2 (pMON17542; SEQ ID NO:6, Figure 3), were used for tobacco plant
transformation, as described in US patent 5,463,175. A vector without the CTP-fda
sequences, pMON17227 (described in PCT Publication WO 92/04449), was used as a
negative control. The plant transformation vectors were mobilized into the ABI
Agrobacterium strain. Mating of the plant vector into the ABI strain was done by the
triparental conjugation system using the helper plasmid pRK2013 (Ditta et al., 1980).

Growth chamber-grown tobacco transformant lines were generated and first
screened by Western blot analysis to identify expressors, using goat antibody raised against
E. coli-expressed FDA. Subsequently, for pMON17524-expressing tobacco lines, leaf
nonstructural carbohydrates (sucrose, glucose, and starch hydrolyzed into glucose) were
analyzed by means of a YSI Instrument, Model 2700 Select Biochemistry Analyzer.
Starting at flowering stage, leaf samples were also taken from these plants and analyzed
for diurnal changes in leaf nonstructural carbohydrates.

Fresh tobacco leaf tissue samples of 500 mg to 1 g were harvested
and extracted in 5 mL of hot Na-phosphate buffer (40 g/L NaH2PO4 and 10 g/L Na2HPO4
in double de-ionized water) by homogenization with a Polytron. Test tubes were then
placed in an 85°C water bath for 15 minutes. Tubes were centrifuged for 12 minutes at
3000 rpm and the supernatants saved for soluble sugar analysis. The pellet was
resuspended in 5 mL of hot Na-phosphate buffer, mixed with a Vortex, and centrifuged as
described above. The supernatant was carefully removed and added to the previous
supernatant fraction for soluble sugar (sucrose and glucose) analysis by YSI using
appropriate membranes.

The starch was extracted from the pellet using the Megazyme Kit (Megazyme,
Australia). To the pellet, 200 µL of 50% ethanol and 3 mL of thermostable alpha-amylase
(300 U) were added and the mixture vortexed. Samples were then incubated in boiling
water for 6 minutes and stirred after 2 and 4 minutes. Tubes were placed in a 50°C water
bath and 4 mL of 200 mM acetate buffer (pH 4.5) were added, followed by 0.1 mL
amyloglucosidase (20 U). Incubation continued for 1 hour. Test tubes were then centrifuged
for 15 minutes at 3000 rpm. Aliquots were taken from the supernatant and analyzed for
glucose by YSI. The free glucose was adjusted to anhydrous glucose (as it occurs in starch)
by multiplying by the ratio 162/182. The total volume per tube was 7.1 mL.
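
The arithmetic implied by this procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the analyzer's own calculation: the function name, the assumption that the YSI glucose reading is expressed in mg/mL, and the example reading are hypothetical. The 162/182 factor is taken from the text as written; the molar masses of glucose (about 180 g/mol) and of an anhydroglucose residue (about 162 g/mol) would give 162/180, so the factor is left as a parameter.

```python
def starch_mg_per_g_fw(glucose_mg_per_ml, fresh_weight_g,
                       volume_ml=7.1, anhydro_factor=162 / 182):
    """Convert a glucose reading of the digested pellet into starch per g fresh weight.

    glucose_mg_per_ml : glucose concentration in the supernatant (assumed unit: mg/mL)
    fresh_weight_g    : fresh weight of the extracted leaf sample (0.5 to 1 g in the text)
    volume_ml         : total volume per tube (7.1 mL in the text)
    anhydro_factor    : free glucose -> anhydroglucose correction (162/182 as stated)
    """
    free_glucose_mg = glucose_mg_per_ml * volume_ml      # total glucose released
    starch_mg = free_glucose_mg * anhydro_factor         # expressed as anhydroglucose
    return starch_mg / fresh_weight_g


# Hypothetical reading: 1.0 mg/mL glucose from a 0.5 g leaf sample.
example = starch_mg_per_g_fw(1.0, 0.5)
print(f"{example:.2f} mg starch per g fresh weight")
```

Keeping the correction factor as a keyword argument makes it easy to compare the stated 162/182 with 162/180 without touching the rest of the calculation.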

As seen in Table 1, expression of the fda gene in tobacco correlated with a
significant increase in leaf starch levels. However, referring to Figure 4, when a diurnal
profile of starch levels was established in the fda-expressing leaves, this increase was
apparent mainly early in the photoperiod, which is a phase when leaves are known to have
peak photosynthetic activity. This increase in starch has no apparent negative effect on the
plant because the increased starch is turned over during the dark period. There was no
apparent increase in steady state levels of sucrose or glucose in tobacco leaves expressing
E. coli fda as compared to the control.






Table 1

Leaf Carbohydrate Levels of Plants Expressing
the fda Transgene 1 (pMON17524)

            High Expressors           Low Expressors    Negative
            (>0.01% total protein)    (<0.01%)          Control
                            (mg/g fresh weight)

STARCH      35.08 ± 2.84              23.25 ± 3.20      16.69 ± 2.92
SUCROSE      0.97 ± 0.17               0.86 ± 0.25       0.66 ± 0.19
GLUCOSE      1.88 ± 0.17               1.58 ± 0.20       1.68 ± 0.26

1 Leaf samples were harvested at midday.



A second set of transgenic tobacco plants, transformed with the construct
pMON17542, was grown in the greenhouse. Tobacco plants containing a vector without
the CTP-fda sequences, pMON17227, were used as negative controls. Of all the
pMON17542 lines screened for expression by Western blot analysis, 18 were high
expressors (>0.01% of the total cellular protein) and 15 lines were low expressors
(<0.01%). Fifteen plants containing the null vector, pMON17227, were used as controls.
Fully expanded leaves from plants expressing the fda transgene and negative controls were
tested for sucrose export by collecting phloem exudate from excised leaf systems. The
phloem exudation technique is described in Groussol et al. (1986). Leaves were harvested
at 11:30 AM, placed in an exudation medium containing 5 mM EDTA at pH 6.0, and
allowed to exude for a period of 4 hours under full light and high humidity. The exudation
solution was immediately analyzed for sucrose level, as described above in the
carbohydrate analysis method. As seen in Table 2, a significant increase in sucrose export
out of source leaves was observed in plants expressing the fda transgene.

This increase in sucrose export by fda-expressing leaves is an illustration of an
increase in source capacity, very likely due to an increased carbon flow through the Calvin
Cycle (in response to increased triose-P utilization) and thus an increase in net carbon
utilization by the leaf. As seen in Table 2, the increase in sucrose loading in the phloem
correlates with the level of fda expression.

Table 2

Levels of Sucrose in Phloem Exudate from
Excised Leaves of fda Transgenic Tobacco Plants (pMON17542)

                       Water uptake       Sucrose in phloem exudate
                       (µL/g F.Wt./h)     (ng/leaf)      (ng/g F.Wt.)

fda high expressors    320 ± 20           330 ± 60       108 ± 22
fda low expressors     340 ± 10           210 ± 10        77 ± 3
Control                390 ± 30           160 ± 10        56 ± 3

Referring to Table 3, preliminary analysis of plant growth and development
revealed no significant differences in number of leaves or pods per plant, plant height,
stem diameter, or apparent seed weight per plant between plants expressing the fda gene
and the vector control under the specific growing and analysis conditions. However, as
seen in Table 4, the fda-transgenic plants had a significantly higher root mass. This may
be an indication that, under these conditions, roots represented a more dominant sink that
attracted excess carbohydrate produced by the source leaves. Furthermore, the present
illustration shows that the increase in root mass in the presence of the E. coli fda gene was
accomplished with no apparent negative effect on shoot growth, inflorescence, or seed set.
Therefore, this increase in root growth and final root dry weight is a desirable plant trait
because it would lead to rapid seedling establishment following germination and a greater
ability of the plant to tolerate drought, cold stress, other environmental challenges, and
transplanting. In different plants and under different growing conditions, other plant parts
(such as seed, fruit, stem, leaf, tuber, bulb, etc.) are expected to show the weight increase
observed in tobacco roots overexpressing the fda transgene.

Table 3

Assessment of Certain Plant Growth and Development Parameters in
Tobacco Expressing the fda Transgene 1 (pMON17542)

                   #pods/plant    #leaves/plant    Plant height    Seed weight
                                                   (cm)            (g/plant)

high expressors    162 ± 40       25.4 ± 0.8       65.3 ± 3.1      18.8 ± 2.4
Control            156 ± 28       24.4 ± 0.5       65.8 ± 5.1      17.3 ± 2.6

1 To achieve this analysis, 14 high-expressor lines were compared to 15 control plants.
Measurements were made prior to seed harvest (most pods had reached maturity). The number of
leaves was confirmed by counting the number of nodes to account for leaf drop.


Table 4

Tobacco Root Dry Weight of Plants Expressing
the E. coli fda Transgene 1 (pMON17542)

                       Root Dry Weight
                       (g/plant)

fda high expressors    64.0 ± 3.9
fda low expressors     62.7 ± 5.4
Control                31.7 ± 1.6

1 Roots from 5 high and 7 low expressing lines and 6 control plants were excised and washed
carefully, then placed in a 65°C drying oven for at least 48 hours. Roots were removed from the
oven and allowed to equilibrate in the laboratory for 2 hours before dry weight determination.
EXAMPLE 3 

Plant transformation and fda expression in corn plants 
Targeting of FDA protein 

Vectors containing the fda gene with and without the plastid targeting peptide
were made for transformation in corn and are also suitable for other monocots, including
rice, wheat, barley, sugarcane, triticale, etc.

For the cytosolic expression of the fda gene in corn plants, a construct was made
in which the fda gene sequence was fused to the backbone of a vector containing the
enhanced CaMV 35S promoter (e35S; Kay et al., 1987), the HSP70 intron (US patent
5,593,874), and the NOS3' polyadenylation sequence (Fraley et al., 1983). This created a
NotI cassette [P-e35S/HSP70 intron/fda/NOS3'] that was cloned into the NotI site of
pMON30460, a monocot transformation vector, to form the plant transformation vector
pMON13925, as shown in Figure 5. pMON30460 contains an expression cassette for the
selectable marker neomycin phosphotransferase type II gene (nptII) [P-35S/NPTII/NOS3']
and a unique NotI site for cloning the gene of interest. The final vector (pMON13925)
was constructed so that the gene of interest and the selectable marker gene were cloned in
the same orientation. A vector fragment containing the expression cassettes for these gene
sequences could be excised from the bacterial selector (Kan) and ori, gel purified, and used
for plant transformation.

For the chloroplast-targeted expression of the fda gene in corn plants, a construct
was made in which the fda gene sequence, coupled to the maize RUBISCO small subunit
CTP (Russell et al., 1993), was fused to the backbone of a vector containing the enhanced
CaMV 35S promoter, the HSP70 intron, and the NOS3' polyadenylation sequences. This
created a NotI cassette [P-e35S/HSP70 intron/mzSSuCTP/fda/NOS3'] that was cloned
into the NotI site (in the same orientation as the selectable marker cassette
[P-35S/NPTII/NOS3']) of the monocot transformation vector pMON30460, to form the vector
pMON17590, as shown in Figure 6. From this vector a fragment containing the fda gene
expression cassette and the selectable marker cassette could be excised from the bacterial
selector (Kan) and ori, gel purified, and used for plant transformation.

For the cytosolic endosperm-specific expression of the aldolase gene in corn, the
fda gene sequence was cloned into a vector (in the same orientation as the selectable
marker cassette [P-35S/NPTII/NOS3']) containing the glutelin gene promoter P-osgt1
(Zheng et al., 1993), the HSP70 intron, and the NOS3' polyadenylation sequences to form
the vector pMON13936, as shown in Figure 7. From this vector a fragment containing the
fda gene expression cassette [P-osgt1/HSP70 intron/fda/NOS3'] and the selectable marker
cassette could be excised from the bacterial selector (Kan) and ori, gel purified, and used
for plant transformation.
Maize plant transformation

Transgenic maize plants transformed with the vectors pMON13925 (described
above) or pMON17590 (described above) were produced using microprojectile
bombardment, a procedure well known in the art (Fromm, 1990; Gordon-Kamm et al.,
1990; Walters et al., 1992). Embryogenic callus initiated from immature maize embryos
was used as a target tissue. Plasmid DNA at 1 mg/mL in TE buffer was precipitated onto
M10 tungsten particles using a calcium chloride/spermidine procedure, essentially as
described by Klein et al. (1988). In addition to the gene of interest, the plasmids also
contained the neomycin phosphotransferase II gene (nptII) driven by the 35S promoter
from Cauliflower Mosaic Virus. The embryogenic callus target tissue was pretreated on
culture medium osmotically buffered with 0.2 M mannitol plus 0.2 M sorbitol for
approximately four hours prior to bombardment (Vain et al., 1993). Tissue was
bombarded two times with the DNA-coated tungsten particles using the gunpowder
version of the BioRad Particle Delivery System (PDS) 1000 device. Approximately 16
hours following bombardment, the tissue was subcultured onto a medium of the same
composition except that it contained no mannitol or sorbitol, and it contained an
appropriate aminoglycoside antibiotic, such as G418, to select for those cells that
contained and expressed the 35S/nptII gene. Actively growing tissue sectors were
transferred to fresh selective medium approximately every 3 weeks. About 3 months after
bombardment, plants were regenerated from surviving embryogenic callus essentially as
described by Duncan and Widholm (1988).
Aldolase activity from transgenic maize 

In order to measure leaf aldolase activity, corn leaf samples were taken and
immediately frozen on dry ice. Aldolase enzyme was extracted from the leaf tissue by
grinding the leaf tissue at 4°C in 1.2 mL of the extraction buffer (100 mM Hepes, pH 8.0,
5 mM MgCl2, 5 mM MnCl2, 100 mM KCl, 10 mM DTT, 1% BSA, 1 mM PMSF, 10
µg/mL leupeptin, 10 µg/mL aprotinin). The extract was centrifuged at 15,000 x g, at 4°C
for 3 minutes, and the non-desalted supernatant was assayed for enzyme activity. This
extraction method gave about 60% recovery of E. coli FDA activity.

Total aldolase activity was determined in 0.98 mL of reaction mixture that
consisted of 100 mM EPPS-NaOH, pH 8.5, 1 mM fructose-bisphosphate, 0.1 mM NADH,
5 mM MgCl2, 4 units of alpha-glycerophosphate dehydrogenase, and 15 units of
triosephosphate isomerase. The reaction was initiated by addition of 20 µL of leaf extract.
The resulting data, generated from a single experiment, are presented in Table 5.



Table 5

Aldolase Activity from Transgenic Maize Leaves

Lines            A340/min/20 µL    Activity %

H99 (control)    0.113             100
pMON17590        0.233             206
pMON13925        0.251             222



A phenotype was visible in the primary transformants (R0 plants) expressing the
E. coli FDA when the protein was targeted to the chloroplast. The leaves were chlorotic
but seed set was normal. R1 plants were grown in both field and greenhouse
experiments. Starch was not detectable in the leaves by iodine staining, and
pollination was delayed. It is believed that the phenotype in these corn plants may be the
result of the promoter (e35S) used in both the pMON17590 and pMON13925 vectors not
being preferred for causing FDA expression in corn. Because e35S is believed to cause
mesophyll-enhanced expression and the Calvin Cycle in a C4 plant such as corn occurs
predominantly in the bundle sheath cells, the use of a promoter directing enhanced
expression in the bundle sheath cells (such as the ssRUBISCO promoter) may be
preferred. Vectors containing such a promoter and driving expression of FDA have been
prepared and are being tested in maize.

In particular, the maize RuBISCO small subunit promoter (PmzSSU, a bundle sheath
cell-specific promoter) has been used to construct vectors for cell-specific fda expression in
maize. A class I aldolase (fdaI), an fda without an iron sulfur cluster and with different
properties from fdaII, was utilized to improve carbon metabolism in C4 crops (e.g. maize).
The gene for the class I aldolase was amplified from the genome of Staphylococcus
aureus and activity was confirmed. Transformation vectors were then constructed to
express both classes of aldolase (fdaI and fdaII) in a cell-specific manner in maize. The
following cassettes have been made:
pMON13899: PmzSSU/hsp70/mzSSU CTP/fdaI
pMON13990: PmzSSU/hsp70/mzSSU CTP/fdaII
pMON13988: P35S/hsp70/fda

These vectors were used for corn transformation as described generally above. The
biochemical and physiological analysis of the primary transformants should allow for the
identification of aldolase gene overexpression that will lead to increased growth,
development, and yield in maize.

Also, two vectors were used for transformation of corn which would target the
expression of the E. coli fdaII gene in the maize endosperm. The vector pMON13936
uses the rice gt1 promoter to drive expression of aldolase in the cytoplasm of the
endosperm cells. Another vector (pMON36416) uses the same promoter with the maize
RuBISCO small subunit transit peptide to localize the protein in the amyloplasts.
Homozygous lines of the cytosolic aldolase transformants have been identified
(homozygosity of 37 plants was confirmed using Western blot analysis) and seed from
these plants was collected for grain composition analysis (moisture, protein, starch, and
oil). Of the 53 pMON36416 primary transformants screened for amyloplast-targeted
aldolase expression, 11 were positive. These plants will be tested for homozygosity
selection/propagation and kernels from the homozygotes will be used for composition
analysis.
EXAMPLE 4 

Plant transformation and fda expression in potato plants 
Targeting of fda expression

The plant expression vector pMON17542 (described earlier), in which the fda
gene is expressed behind the FMV promoter and the aldolase enzyme is fused to the
chloroplast transit peptide CTP2, was used for Agrobacterium-mediated potato
transformation.

A second potato transformation vector was constructed by cloning the NotI
cassette [P-FMV/CTP2/fda/NOS3'] (described earlier) into the unique NotI site of
pMON23616. pMON23616 is a potato transformation vector containing the nopaline-type
T-DNA right border region (Fraley et al., 1985), an expression cassette for the neomycin
phosphotransferase type II gene [P-35S/NPTII/NOS3'] (selectable marker), a unique NotI
site for cloning the gene expression cassette of interest, and the T-DNA left border region
(Barker et al., 1983). Cloning of the NotI cassette [P-FMV/CTP2/fda/NOS3'] (described
earlier) into the NotI site of pMON23616 results in the potato transformation vector
pMON17581, as shown in Figure 8. The vector pMON17581 was constructed such that
the gene of interest and the selectable marker gene were transcribed in the same direction.
Potato plant transformation 

The plant transformation vectors were mobilized into the ABI Agrobacterium
strain. Mating of the plant vector into the ABI strain was done by the triparental
conjugation system using the helper plasmid pRK2013 (Ditta et al., 1980). The vector
pMON17542 was used for potato transformation via Agrobacterium transformation of
Russet Burbank potato callus, following the method described in PCT Publication WO
96/03513 for glyphosate selection of transformed lines.

After transformation with the vector pMON17542, transgenic potato plantlets that
came through selection on glyphosate were screened for expression of E. coli aldolase by
leaf Western blot analysis. Out of 112 independent lines assayed, 50 fda-expressing lines
(45%) were identified, with fda expression levels ranging between 0.12% and 1.2% of
total extractable protein.

The plant transformation vector pMON17581 was used for Agrobacterium-
mediated transformation of HS31-638 potato callus. HS31-638 is a Russet Burbank potato
line previously transformed with the mutant ADPglucose pyrophosphorylase (glgC16)
gene from E. coli (U.S. Patent 5,498,830). The potato callus was transformed following the
method described in PCT Publication WO 96/03513, substituting kanamycin
(administered at a concentration of 150-200 mg/L) for glyphosate as a selective agent.

The transgenic potato plants were screened for expression of the fda gene by
assaying leaf punches from tissue culture plantlets. Western blot analysis (using antibodies
raised against the E. coli aldolase) of leaf tissue from the pMON17581-transformed lines
identified 12 expressing lines out of 56 lines screened. Expression was detected of a
protein migrating at approximately 40 kDa, which is the molecular weight of the E. coli
(class II) aldolase subunit and the size of the protein observed after overexpression of the
aldolase in E. coli.

Specific gravity measurements of transgenic potato plants 

From the 50 fda-expressing potato lines obtained after transformation with
pMON17542, 7 of the highest expressing lines were micropropagated in tissue culture, and
8 copies of each line were planted in pots 14 inches in diameter and 12 inches deep,
containing a mixture of 1/2 Metro 350 potting media, 1/4 fine sand, and 1/4 Ready Earth
potting media. Wild-type Russet Burbank plantlets from tissue culture were planted as
controls. All plants were cultivated for approximately 5 months in the greenhouse, in
which daytime temperature was approximately 21-23°C and nighttime temperature was
approximately 13°C. Plants were watered every other day throughout their active growing
period and fertilized with Peter's 20-20-20 commercial fertilizer once a week, at levels
similar to commercial applications. Fertilization was carried out only for the first 2 1/2
months, at which point fertilization was stopped completely. Plants were allowed to
senesce naturally, and at approximately 50% senescence, tubers were harvested.

For each line at harvest, all tubers from all 8 pots were pooled and a total weight
was obtained. Then for each line, tubers of 30 g or greater were pooled and specific gravity
was determined on this group of tubers. Specific gravity is the weight of the tubers in air
divided by the difference between the weight in air and the weight in water. Results of
these weight measurements are presented in Table 6.

Table 6

Specific gravity measurements from transgenic potato plants

Line #   Total    Overall    Combined Weight   % Increase in     Combined Weight of     Specific
         Weight   % Yield    of Tubers         Total Weight      Tubers over 30g        Gravity
                  Increase   over 30g          (Tubers over      (% of Total Weight)
                                               30g)

RB       6609                4477                                67.70%                 1.087
40652    5138     neg        1307              neg               25.40%                 1.08
40611    7170     8.5%       4533              1.3%              63.20%                 1.083
40608    7470     13.0%      1070              neg               14.30%                 1.081
40632    7776     21.8%      5453              21.8%             70.10%                 1.088
40614    8688     31.5%      5468              22.2%             62.90%                 1.083
40631    8800     33.2%      6188              38.2%             70.30%                 1.084
40610    9746     47.0%      7777              73.0%             80%                    1.087



This table summarizes the tuber yield and specific gravity for all seven lines grown in the
greenhouse. The results indicate that, in comparison to the control, all but one of the fda
lines show an increase in overall tuber yield, and that in four lines, there is a corresponding
increase in percentage of tubers that weigh more than 30 g. For combined tubers over 30
g, the percent of total weight is near that of the control, and for two lines is greater than the
control. This indicates that five out of the six lines with higher overall yield
are not making smaller tubers. In other words, with the increase in overall yield, there is a
corresponding increase in percentage of bigger tubers (over 30 g). However, there is no
increase in specific gravity of the tubers.

In conclusion, it appears that expression of fda in potato produces greater numbers
of tubers per plant without a sacrifice in tuber size. This represents a yield benefit in that
the farmer could potentially produce the same amount of tubers using less
acreage. Similar experiments will also be performed by co-expression of fda with other
carbohydrate metabolizing genes, such as glgC16, in order to determine how such
combinations will affect tuber yield, tuber solids deposition and overall tuber specific
gravity.
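
The percentage columns of Table 6 can be recomputed from the weight columns. The sketch below is an illustration only: the function names are hypothetical, and it assumes that the increase columns are expressed relative to the Russet Burbank (RB) control and that the last percentage column is the over-30 g weight divided by the total weight of the same line.

```python
def percent_increase(value, control):
    """Percent change of a line's weight relative to the control weight."""
    return 100.0 * (value - control) / control


def percent_of_total(part, total):
    """Weight of the over-30 g tubers as a percentage of the line's total weight."""
    return 100.0 * part / total


# Weights taken from Table 6 (units as reported in the source table).
rb_total, rb_over_30g = 6609, 4477          # RB control
line_total, line_over_30g = 7170, 4533      # line 40611

print(round(percent_increase(line_total, rb_total), 1))        # 8.5
print(round(percent_increase(line_over_30g, rb_over_30g), 1))  # 1.3
print(round(percent_of_total(line_over_30g, line_total), 1))   # 63.2
```

Lines whose weights fall below the control give negative values with these functions, which the table reports as "neg".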

Aldolase activity from transgenic potato 

After being cultivated for 3 months (post planting) in the greenhouse, leaf samples
were taken from 6 of the highest fda-expressing potato lines, obtained after transformation
with pMON17542, and assayed for aldolase activity.

In order to measure potato leaf aldolase activity, duplicate leaf samples from each
line were taken and immediately frozen on dry ice. Aldolase was extracted from 0.2 g of
leaf tissue by grinding at 4°C in 1.2 mL of the extraction buffer: 100 mM Hepes, pH 8.0,
5 mM MgCl2, 5 mM MnCl2, 100 mM KCl, 10 mM DTT, 1% BSA, 1 mM PMSF, 10
µg/mL leupeptin, 10 µg/mL aprotinin. The extract was assayed for aldolase activity as
described earlier.

Six independent transgenic potato lines expressing fda were tested for aldolase
activity. The expression of fda in leaves is an indicator of the expression in the whole
plant because the FMV promoter used to drive expression of the respective encoding
DNAs directs gene expression constitutively in most, if not all, tissues of potato plants.

Table 7 summarizes the quantitative protein expression data for each of the lines,
and the percent activity for each individual line.






Table 7

Aldolase Activity from
Transgenic Russet Burbank Potato Leaves

           Exp. #1                  Exp. #2                  Average
Lines      Act (U/g FW)    %Act     Act (U/g FW)    %Act     % Activity

Control    4.461           100      4.732           100      100
40608      6.969           156      8.055           170      163
40610      8.489           190      7.326           155      173
40652      5.812           130      6.367           135      132
40632      5.257           118      4.244            90      104
40631      5.764           129      4.968           105      117
40611      5.715           128      5.836           123      126
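
The percent-activity columns of Table 7 follow from dividing each line's activity by the control activity of the same experiment. The sketch below illustrates this for one line; the function name is hypothetical, and averaging the unrounded percentages of the two experiments is an assumption that happens to reproduce the printed averages.

```python
def percent_of_control(activity, control_activity):
    """Activity of a transgenic line expressed as a percentage of the control."""
    return 100.0 * activity / control_activity


# Activities from Table 7: (Exp. #1, Exp. #2)
control = (4.461, 4.732)
line_40610 = (8.489, 7.326)

percents = [percent_of_control(a, c) for a, c in zip(line_40610, control)]
average = sum(percents) / len(percents)

print([round(p) for p in percents])   # [190, 155]
print(round(average))                 # 173
```

The same function applied to the maize data of Table 5 (0.233 against a control of 0.113) gives about 206%.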



Solids uniformity in transgenic potato 

Twenty-five Russet Burbank lines expressing fda (potato lines designated
"Maestro"), obtained after transformation with pMON17542, and fifteen Russet Burbank
Simple Solid lines, also containing glgC16 (PCT Publication WO 91/19806 and US Patent
5,498,830), expressing fda (potato lines designated "Segal"), obtained after transformation
with pMON17581, were field tested at two different sites. For each field site, 36 plants
per line (three repetitions of 12 plants per line) were evaluated for tuber solids distribution.
At harvest, tubers were pre-sorted at each field site into a ten to twelve ounce category,
and nine tubers from each replicated plot were analyzed in groups of three.

For a typical 10-12 ounce tuber having a diameter of 7-8 cm, starch distribution
was evaluated by removing the center longitudinal slice (13 mm) from each tuber. Slices
were then peeled and laid flat on a cutting board, where the inner tuber region (pith region)
was removed with a 14-mm cork punch. The tissue from pith to cortex (perimedullary
region) was removed with a cork punch of up to 2 inches. The remaining cortex tissue was
a ring approximately 8 mm wide from the outermost region of the slice.

Specific gravity was then determined by weighing both the pooled pith punches
and pooled cortex punches in air and then in water:

Specific gravity = Air Wt./(Air Wt. - Water Wt.)

After calculating specific gravity, solids levels were determined by the following equation:

-214.9206 + (218.1852 * Sp. Gravity)

The degree of solids uniformity (Solids Uniformity Index) is determined by calculating the
pith to cortex solids ratio (pith solids divided by cortex solids). The three groups of three
tubers per plot were averaged, at which point the average of three plot replications was
calculated per field site.
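
The calculation chain described above (weights to specific gravity, specific gravity to solids, solids to the pith to cortex ratio) can be sketched as follows. The function names and the example weights are hypothetical; the two formulas and their coefficients are those given in the text, and expressing the index as a percentage is an assumption based on how the ratios are reported.

```python
def specific_gravity(air_wt, water_wt):
    """Specific gravity = Air Wt. / (Air Wt. - Water Wt.)."""
    return air_wt / (air_wt - water_wt)


def solids_level(sp_gravity):
    """Solids level from specific gravity, using the equation given in the text."""
    return -214.9206 + 218.1852 * sp_gravity


def solids_uniformity_index(pith_sg, cortex_sg):
    """Pith solids divided by cortex solids, expressed as a percentage."""
    return 100.0 * solids_level(pith_sg) / solids_level(cortex_sg)


# Hypothetical pooled punch weights (air, water), in the same arbitrary weight unit.
pith_sg = specific_gravity(50.0, 3.0)
cortex_sg = specific_gravity(50.0, 4.5)

print(round(pith_sg, 4), round(cortex_sg, 4))
print(round(solids_uniformity_index(pith_sg, cortex_sg), 1))
```

A higher index means the pith solids are closer to the cortex solids, which is the sense in which the text speaks of greater solids uniformity.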

Analyses of several previous solids uniformity field trials (data not shown) have
demonstrated nontransgenic, wild-type Russet Burbank potato to have a typical pith to
cortex tuber solids ratio within the range of 68% to 72%, depending on growing region
and agricultural practices. Tables 8-11 provide the pith to cortex solids ratios by plant line
number, with a higher pith to cortex solids ratio indicating a greater degree of solids
uniformity.

Tables 8 and 9 represent the data from one field site (site 1) for Segal and Maestro,
respectively, and illustrate that the majority of Segal and Maestro lines have higher pith to
cortex solids ratios than that of 68.4% for the Russet Burbank control, with some lines
approaching an 82% pith to cortex solids ratio.

Tables 10 and 11 represent the data from another field site (site 2) for Segal and
Maestro, respectively, and also illustrate that the majority of Maestro and Segal lines have
higher pith to cortex solids ratios than that of the Russet Burbank control, with some lines
approaching an 88% pith to cortex solids ratio. In the site 2 field trial, the Russet Burbank
control had an atypical, abnormally high pith-to-cortex solids uniformity ratio of 79.3%,
which was most likely due to environmental growing conditions. The site 2 results
demonstrate that expression in Russet Burbank potato of E. coli fda, alone or with co-
expression of glgC16, increases tuber solids uniformity even in a growing season when
tuber solids uniformity is already extremely high in nontransgenic Russet Burbank. That
is, the fda gene continues to perform when agricultural conditions are already conducive to
an abnormally high solids uniformity level.






Table 8. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Segal Russet Burbank Lines. Site 1

Line          Ratio (%)

S-29          79.1
S-9           75.8
S-20          71.3
S-15          71.3
S-21          70.5
S-5           70.2
S-18          70.0
RB control    68.4
S-32          68.3
S-16          65.6


Table 9. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Maestro Russet Burbank Lines. Site 1

Line          Ratio
M-13          74.0
M-12          73.6
M-1           73.4
M-3           73.0
M-6           72.4
M-9           71.2
M-11          70.6
M-18          70.5
M-17          69.9
M-19          69.4
M-5           69.3
M-20          68.9
RB control    68.4
M-8           68.3
M-43          67.7
M-23          67.3
M-7           67.0
M-39          66.6
M-22          66.0
M-10          65.4
M-27          61.4

Table 10. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Segal Russet Burbank Lines. Site 2

Line          Ratio
S-33          87.4
S-54          87.1
S-05          86.8
S-29          85.1
S-21          84.3
S-16          83.2
S-20          81.5
S-18          80.7
S-32          80.6
RB control    79.3
S-09          79.0



Table 11. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Maestro Russet Burbank Lines. Site 2

Line          Ratio
M-04          87.7
M-18          83.9
M-17          83.8
M-03          83.7
M-09          83.4
M-15          83.2
M-29          82.9
M-44          82.3
M-08          82.2
M-43          81.6
M-22          81.1
M-05          80.8
M-01          80.5
M-20          80.2
M-45          79.6
M-39          79.5
M-27          79.5
RB control    79.3
M-13          78.9
M-22          78.8
M-19          78.7
M-07          78.2
M-12          77.9
M-23          77.3
M-06          76.5
M-10          75.0
M-11          74.1

The effect of aldolase on pith to cortex solids ratios in the Segal lines is slightly
more dramatic than in Maestro lines. We believe this phenotype is due to expression of
fda in a background in which the Russet Burbank host expresses glgC16 at a relatively low
to moderate level, and that the combination of fda plus glgC16 provides improved
benefits. Cross sectional tuber slices (Figure 9) of three Segal lines with improved solids
uniformity illustrate a greater deposition of starch within the inner regions of the tuber.
Specifically, an increase in cortex volume accompanied by relocation of the xylem ring
towards the center of the tuber, plus a more opaque pith tissue due to an increase in starch
density, are evident in the transgenic lines. This physiological alteration may be due to an
increase in sucrose translocation from source to sink, which may influence phloem
element distribution during tuber development or sucrose availability for starch
biosynthesis across the tuber.
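
As an illustrative cross-check of the tabulated ratios (not part of the original analysis), the following Python sketch counts, for each of Tables 8-11, how many transgenic lines exceed the Russet Burbank control at the same site. The ratio values are transcribed from the tables; the dictionary layout and the function name `lines_above_control` are assumptions made for this example only.

```python
# Pith-to-cortex solids ratios (%) transcribed from Tables 8-11.
# Each entry is (RB control ratio, [ratios of the transgenic lines]).
# Line names are omitted because only the counts are needed here.
TABLES = {
    "Table 8 (Segal, site 1)": (68.4, [
        79.1, 75.8, 71.3, 71.3, 70.5, 70.2, 70.0, 68.3, 65.6]),
    "Table 9 (Maestro, site 1)": (68.4, [
        74.0, 73.6, 73.4, 73.0, 72.4, 71.2, 70.6, 70.5, 69.9, 69.4,
        69.3, 68.9, 68.3, 67.7, 67.3, 67.0, 66.6, 66.0, 65.4, 61.4]),
    "Table 10 (Segal, site 2)": (79.3, [
        87.4, 87.1, 86.8, 85.1, 84.3, 83.2, 81.5, 80.7, 80.6, 79.0]),
    "Table 11 (Maestro, site 2)": (79.3, [
        87.7, 83.9, 83.8, 83.7, 83.4, 83.2, 82.9, 82.3, 82.2, 81.6,
        81.1, 80.8, 80.5, 80.2, 79.6, 79.5, 79.5, 78.9, 78.8, 78.7,
        78.2, 77.9, 77.3, 76.5, 75.0, 74.1]),
}


def lines_above_control(control, ratios):
    """Return (number of lines above the control, total number of lines)."""
    above = sum(1 for ratio in ratios if ratio > control)
    return above, len(ratios)


if __name__ == "__main__":
    for name, (control, ratios) in TABLES.items():
        above, total = lines_above_control(control, ratios)
        print(f"{name}: {above} of {total} lines above control ({control}%)")
```

Under this transcription, more than half of the lines in every table lie above the control, which agrees with the statement that the majority of Segal and Maestro lines have higher ratios than Russet Burbank.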
Example 5 

Plant transformation and FDA expression in cotton plants 

The E. coli fda vectors pMON17524 [FMV/CTP1/fda] (Figure 2) and
pMON17542 [FMV/CTP2/fda] (Figure 3) were transformed into cotton using
Agrobacterium as described by Umbeck et al. (1987) and in US Patent 5004863. The
protein was targeted to the chloroplast using either the Arabidopsis SSU CTP1
(pMON17524) or the Arabidopsis EPSPS (pMON17542) chloroplast transit peptide.

Aldolase expression in cotton

Five-week-old calli transformed with both vectors were analyzed by Western blot
analyses and by aldolase assays. Western blot analysis indicated a large amount of protein
at the position of the full-length FDA standard and a lesser amount at the same position in
the control callus extracts. It appeared that the protein was fully processed. To verify that
FDA was expressed in the tissue and for comparison of activity, calli transformed with the
two vectors were extracted in a buffer that would prevent loss of activity of the transgene
product. BSA was added to a final concentration of 1 mg/mL, which limited the analysis of
processing on import by Western blot. Aldolase assays were performed plus or minus 25
mM EDTA, which inhibits the E. coli enzyme but not the plant enzyme. The results of the
assays are shown in Table 12.

Table 12

Aldolase Activity in Cotton Calli and Cotton Leaf
ΔA340 e-3/mg protein/5 min

Colony #                              -EDTA    +EDTA    Fold Increase

Controls
Cotton Leaf (Coker)                     4.0      4.2
Uninoculated Calli                      7.7      5.6
Inoculated Calli (E35S/GUS) #1          6.8      6.1
                            #2          3.5      4.0

FDA calli
pMON17542 #1                            3.5      2.3      1.5X
          #3                            5.5      2.6      2.1X
          #5                            9.2      3.8      2.4X
          #4                           19.8      3.6      5.5X
pMON17524 #2                           15.2      5.8      2.6X
          #3                           12.5      4.0      3.1X
          #5                           14.4      2.9      4.9X
          #6                            4.1      1.2      3.5X

The results indicate that there is good expression of the fda gene in cotton callus. Almost all
calli had at least twofold higher aldolase activity, and the increase was sensitive to inhibition by
EDTA. Processing appeared complete by Western blot analysis using these samples.
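
The fold increases in Table 12 appear to be the ratio of activity without EDTA to activity with EDTA. The Python sketch below recomputes that ratio from the tabulated activities under that assumption; the dictionary names and the helper `fold_increase` are hypothetical and introduced only for illustration.

```python
# (-EDTA, +EDTA) aldolase activities transcribed from Table 12.
CONTROLS = {
    "Cotton Leaf (Coker)": (4.0, 4.2),
    "Uninoculated Calli": (7.7, 5.6),
    "Inoculated Calli (E35S/GUS) #1": (6.8, 6.1),
    "Inoculated Calli (E35S/GUS) #2": (3.5, 4.0),
}
FDA_CALLI = {
    "pMON17542 #1": (3.5, 2.3),
    "pMON17542 #3": (5.5, 2.6),
    "pMON17542 #5": (9.2, 3.8),
    "pMON17542 #4": (19.8, 3.6),
    "pMON17524 #2": (15.2, 5.8),
    "pMON17524 #3": (12.5, 4.0),
    "pMON17524 #5": (14.4, 2.9),
    "pMON17524 #6": (4.1, 1.2),
}


def fold_increase(minus_edta, plus_edta):
    """Assumed definition: EDTA-free activity divided by activity with EDTA."""
    return minus_edta / plus_edta


if __name__ == "__main__":
    for colony, (minus, plus) in FDA_CALLI.items():
        print(f"{colony}: {fold_increase(minus, plus):.2f}X")
```

With this definition, seven of the eight FDA calli come out at twofold or higher, consistent with the statement that almost all calli had at least twofold higher activity, while every control stays below 1.5X. Small differences from the printed values (for example the last two rows) presumably reflect rounding of the tabulated activities.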
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Gerard Barry
Nordine Cheikh
Ganesh Kishore

(ii) TITLE OF INVENTION: Expression of Fructose 1,6 Bisphosphate
Aldolase in Transgenic Plants

(iii) NUMBER OF SEQUENCES: 6

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Arnold, White & Durkee
(B) STREET: P.O. Box 4433
(C) CITY: Houston
(D) STATE: Texas
(E) COUNTRY: United States of America
(F) ZIP: 77210-4433

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US Unknown
(B) FILING DATE: Concurrently Herewith
(C) CLASSIFICATION: Unknown

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US Prov. App. Serial No. 60/049,995
(B) FILING DATE: June 17, 1997

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Patricia A. Kammerer
(B) REGISTRATION NUMBER: 29,775
(C) REFERENCE/DOCKET NUMBER: MOBT086

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (713) 787-1400
(B) TELEFAX: (713) 787-1440


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGTCTAAGA TTTTTGATTT CGTAAAACCT GGCGTAATCA CTGGTGATGA CGTACAGAAA    60
GTTTTCCAGG TAGCAAAAGA AAACAACTTC GCACTGCCAG CAGTAAACTG CGTCGGTACT   120
GACTCCATCA ACGCCGTACT GGAAACCGCT GCTAAAGTTA AAGCGCCGGT TATCGTTCAG   180
TTCTCCAACG GTGGTGCTTC CTTTATCGCT GGTAAAGGCG TGAAATCTGA CGTTCCGCAG   240
GGTGCTGCTA TCCTGGGCGC GATCTCTGGT GCGCATCACG TTCACCAGAT GGCTGAACAT   300
TATGGTGTTC CGGTTATCCT GCACACTGAC CACTGCGCGA AGAAACTGCT GCCGTGGATC   360
GACGGTCTGT TGGACGCGGG TGAAAAACAC TTCGCAGCTA CCGGTAAGCC GCTGTTCTCT   420
TCTCACATGA TCGACCTGTC TGAAGAATCT CTGCAAGAGA ACATCGAAAT CTGCTCTAAA   480
TACCTGGAGC GCATGTCCAA AATCGGCATG ACTCTGGAAA TCGAACTGGG TTGCACCGGT   540
GGTGAAGAAG ACGGCGTGGA CAACAGCCAC ATGGACGCTT CTGCACTGTA CACCCAGCCG   600
GAAGACGTTG ATTACGCATA CACCGAACTG AGCAAAATCA GCCCGCGTTT CACCATCGCA   660
GCGTCCTTCG GTAACGTACA CGGTGTTTAC AAGCCGGGTA ACGTGGTTCT GACTCCGACC   720
ATCCTGCGTG ATTCTCAGGA ATATGTTTCC AAGAAACACA ACCTGCCGCA CAACAGCCTG   780
AACTTCGTAT TCCACGGTGG TTCCGGTTCT ACTGCTCAGG AAATCAAAGA CTCCGTAAGC   840
TACGGCGTAG TAAAAATGAA CATCGATACC GATACCCAAT GGGCAACCTG GGAAGGCGTT   900
CTGAACTACT ACAAAGCGAA CGAAGCTTAT CTGCAGGGTC AGCTGGGTAA CCCGAAAGGC   960
GAAGATCAGC CGAACAAGAA ATACTACGAT CCGCGCGTAT GGCTGCGTGC CGGTCAGACT  1020
TCGATGATCG CTCGTCTGGA GAAAGCATTC CAGGAACTGA ACGCGATCGA CGTTCTGTAA  1080

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ser Lys Ile Phe Asp Phe Val Lys Pro Gly Val Ile Thr Gly
                5                   10                  15

Asp Asp Val Gln Lys Val Phe Gln Val Ala Lys Glu Asn Asn Phe
                20                  25                  30

Ala Leu Pro Ala Val Asn Cys Val Gly Thr Asp Ser Ile Asn Ala
                35                  40                  45

Val Leu Glu Thr Ala Ala Lys Val Lys Ala Pro Val Ile Val Gln
                50                  55                  60

Phe Ser Asn Gly Gly Ala Ser Phe Ile Ala Gly Lys Gly Val Lys
                65                  70                  75

Ser Asp Val Pro Gln Gly Ala Ala Ile Leu Gly Ala Ile Ser Gly
                80                  85                  90

Ala His His Val His Gln Met Ala Glu His Tyr Gly Val Pro Val
                95                  100                 105

Ile Leu His Thr Asp His Cys Ala Lys Lys Leu Leu Pro Trp Ile
                110                 115                 120

Asp Gly Leu Leu Asp Ala Gly Glu Lys His Phe Ala Ala Thr Gly
                125                 130                 135

Lys Pro Leu Phe Ser Ser His Met Ile Asp Leu Ser Glu Glu Ser
                140                 145                 150

Leu Gln Glu Asn Ile Glu Ile Cys Ser Lys Tyr Leu Glu Arg Met
                155                 160                 165

Ser Lys Ile Gly Met Thr Leu Glu Ile Glu Leu Gly Cys Thr Gly
                170                 175                 180

Gly Glu Glu Asp Gly Val Asp Asn Ser His Met Asp Ala Ser Ala
                185                 190                 195

Leu Tyr Thr Gln Pro Glu Asp Val Asp Tyr Ala Tyr Thr Glu Leu
                200                 205                 210

Ser Lys Ile Ser Pro Arg Phe Thr Ile Ala Ala Ser Phe Gly Asn
                215                 220                 225

Val His Gly Val Tyr Lys Pro Gly Asn Val Val Leu Thr Pro Thr
                230                 235                 240

Ile Leu Arg Asp Ser Gln Glu Tyr Val Ser Lys Lys His Asn Leu
                245                 250                 255

Pro His Asn Ser Leu Asn Phe Val Phe His Gly Gly Ser Gly Ser
                260                 265                 270

Thr Ala Gln Glu Ile Lys Asp Ser Val Ser Tyr Gly Val Val Lys
                275                 280                 285

Met Asn Ile Asp Thr Asp Thr Gln Trp Ala Thr Trp Glu Gly Val
                290                 295                 300

Leu Asn Tyr Tyr Lys Ala Asn Glu Ala Tyr Leu Gln Gly Gln Leu
                305                 310                 315

Gly Asn Pro Lys Gly Glu Asp Gln Pro Asn Lys Lys Tyr Tyr Asp
                320                 325                 330

Pro Arg Val Trp Leu Arg Ala Gly Gln Thr Ser Met Ile Ala Arg
                335                 340                 345

Leu Glu Lys Ala Phe Gln Glu Leu Asn Ala Ile Asp Val Leu
                350                 355
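
As a consistency check between SEQ ID NO:1 and SEQ ID NO:2, the Python sketch below translates the first 45 and last 30 nucleotides of SEQ ID NO:1 with the standard genetic code and compares the lengths of the two sequences. The helper `translate` and the constant names are assumptions made for this example; the output uses one-letter amino acid codes rather than the three-letter codes of the listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 45 and last 30 nucleotides of SEQ ID NO:1 as listed above.
SEQ1_START = "ATGTCTAAGA TTTTTGATTT CGTAAAACCT GGCGTAATCA CTGGT"
SEQ1_END = "CAGGAACTGA ACGCGATCGA CGTTCTGTAA"

# Stated lengths: 1080 base pairs and 359 amino acids.
SEQ1_LENGTH_BP = 1080
SEQ2_LENGTH_AA = 359

if __name__ == "__main__":
    print(translate(SEQ1_START))  # residues 1-15 of SEQ ID NO:2
    print(translate(SEQ1_END))    # residues 351-359 followed by the stop
    print(SEQ1_LENGTH_BP == 3 * (SEQ2_LENGTH_AA + 1))
```

The 1080 base pairs correspond to 359 codons plus one stop codon, and the translated ends match the first and last residues of SEQ ID NO:2 (Met Ser Lys Ile Phe ... Ile Asp Val Leu).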

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GGGGCCATGG CTAAGATTTT TGATTTCGTA

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
CCCCGAGCTC TTACAGAACG TCGATCGCGT TCAG 
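
The relationship between the two oligonucleotides (SEQ ID NO:3 and SEQ ID NO:4) and the fda coding sequence of SEQ ID NO:1 can be checked with a few lines of Python. This is only a sketch: the recognition sequences used for NcoI (CCATGG) and SacI (GAGCTC) are standard values that are not stated in this listing, and the constant and function names are assumptions made for the example.

```python
# Oligonucleotides from the listing (spaces removed).
SEQ3 = "GGGGCCATGGCTAAGATTTTTGATTTCGTA"
SEQ4 = "CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG"

# First and last 24 nucleotides of SEQ ID NO:1.
FDA_5PRIME = "ATGTCTAAGATTTTTGATTTCGTA"
FDA_3PRIME = "CTGAACGCGATCGACGTTCTGTAA"

# Standard recognition sequences (assumed; not given in the listing).
NCOI_SITE = "CCATGG"
SACI_SITE = "GAGCTC"


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    print(NCOI_SITE in SEQ3, SACI_SITE in SEQ4)
    # Gene-specific part of SEQ ID NO:3 starts at the ATG inside CCATGG.
    print(mismatches(SEQ3[6:], FDA_5PRIME))
    # Gene-specific part of SEQ ID NO:4 is the reverse complement of the 3' end.
    print(reverse_complement(SEQ4[10:]) == FDA_3PRIME)
```

Under these assumptions, SEQ ID NO:3 matches the 5' end of the coding sequence except for a single base in the second codon (TCT to GCT), which completes the CCATGG site, and SEQ ID NO:4 is the reverse complement of the 3' end, including the stop codon, preceded by a GAGCTC site.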
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10847 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

    1  CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC
   51  TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT
  101  GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT
  151  CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG
  201  AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
  251  TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG
  301  GCGTGCATGC TTCACGGTGC AAGCAGCCGT CCAGCAACTG CTCGTAAGTC
  351  CTCTGGTCTT TCTGGAACCG TCCGTATTCC AGGTGACAAG TCTATCTCCC
  401  ACAGGTCCTT CATGTTTGGA GGTCTCGCTA GCGGTGAAAC TCGTATCACC
  451  GGTCTTTTGG AAGGTGAAGA TGTTATCAAC ACTGGTAAGG CTATGCAAGC
  501  TATGGGTGCC AGAATCCGTA AGGAAGGTGA TACTTGGATC ATTGATGGTG
  551  TTGGTAACGG TGGACTCCTT GCTCCTGAGG CTCCTCTCGA TTTCGGTAAC
  601  GCTGCAACTG GTTGCCGTTT GACTATGGGT CTTGTTGGTG TTTACGATTT
  651  CGATAGCACT TTCATTGGTG ACGCTTCTCT CACTAAGCGT CCAATGGGTC
  701  GTGTGTTGAA CCCACTTCGC GAAATGGGTG TGCAGGTGAA GTCTGAAGAC
  751  GGTGATCGTC TTCCAGTTAC CTTGCGTGGA CCAAAGACTC CAACGCCAAT
  801  CACCTACAGG GTACCTATGG CTTCCGCTCA AGTGAAGTCC GCTGTTCTGC
  851  TTGCTGGTCT CAACACCCCA GGTATCACCA CTGTTATCGA GCCAATCATG
  901  ACTCGTGACC ACACTGAAAA GATGCTTCAA GGTTTTGGTG CTAACCTTAC
  951  CGTTGAGACT GATGCTGACG GTGTGCGTAC CATCCGTCTT GAAGGTCGTG
 1001  GTAAGCTCAC CGGTCAAGTG ATTGATGTTC CAGGTGATCC ATCCTCTACT
 1051  GCTTTCCCAT TGGTTGCTGC CTTGCTTGTT CCAGGTTCCG ACGTCACCAT
 1101  CCTTAACGTT TTGATGAACC CAACCCGTAC TGGTCTCATC TTGACTCTGC
 1151  AGGAAATGGG TGCCGACATC GAAGTGATCA ACCCACGTCT TGCTGGTGGA
 1201  GAAGACGTGG CTGACTTGCG TGTTCGTTCT TCTACTTTGA AGGGTGTTAC
 1251  TGTTCCAGAA GACCGTGCTC CTTCTATGAT CGACGAGTAT CCAATTCTCG
 1301  CTGTTGCAGC TGCATTCGCT GAAGGTGCTA CCGTTATGAA CGGTTTGGAA
 1351  GAACTCCGTG TTAAGGAAAG CGACCGTCTT TCTGCTGTCG CAAACGGTCT
 1401  CAAGCTCAAC GGTGTTGATT GCGATGAAGG TGAGACTTCT CTCGTCGTGC
 1451  GTGGTCGTCC TGACGGTAAG GGTCTCGGTA ACGCTTCTGG AGCAGCTGTC
 1501  GCTACCCACC TCGATCACCG TATCGCTATG AGCTTCCTCG TTATGGGTCT
 1551  CGTTTCTGAA AACCCTGTTA CTGTTGATGA TGCTACTATG ATCGCTACTA
 1601  GCTTCCCAGA GTTCATGGAT TTGATGGCTG GTCTTGGAGC TAAGATCGAA
 1651  CTCTCCGACA CTAAGGCTGC TTGATGAGCT CAAGAATTCG AGCTCGGTAC
 1701  CGGATCCAGC TTTCGTTCGT ATCATCGGTT TCGACAACGT TCGTCAAGTT
 1751  CAATGCATCA GTTTCATTGC GCACACACCA GAATCCTACT GAGTTCGAGT
 1801  ATTATGGCAT TGGGAAAACT GTTTTTCTTG TACCATTTGT TGTGCTTGTA
 1851  ATTTACTGTG TTTTTTATTC GGTTTTCGCT ATCGAACTGT GAAATGGAAA
 1901  TGGATGGAGA AGAGTTAATG AATGATATGG TCCTTTTGTT CATTCTCAAA
 1951  TTAATATTAT TTGTTTTTTC TCTTATTTGT TGTGTGTTGA ATTTGAAATT
 2001  ATAAGAGATA TGCAAACATT TTGTTTTGAG TAAAAATGTG TCAAATCGTG
 2051  GCCTCTAATG ACCGAAGTTA ATATGAGGAG TAAAACACTT GTAGTTGTAC
 2101  CATTATGCTT ATTCACTAGG CAACAAATAT ATTTTCAGAC CTAGAAAAGC
 2151  TGCAAATGTT ACTGAATACA AGTATGTCCT CTTGTGTTTT AGACATTTAT
 2201  GAACTTTCCT TTATGTAATT TTCCAGAATC CTTGTCAGAT TCTAATCATT
 2251  GCTTTATAAT TATAGTTATA CTCATGGATT TGTAGTTGAG TATGAAAATA
 2301  TTTTTTAATG CATTTTATGA CTTGCCAATT GATTGACAAC ATGCATCAAT
 2351  CGACCTGCAG CCACTCGAAG CGGCCGCGTT CAAGCTTGAG CTCAGGATTT
 2401  AGCAGCATTC CAGATTGGGT TCAATCAACA AGGTACGAGC CATATCACTT
 2451  TATTCAAATT GGTATCGCCA AAACCAAGAA GGAACTCCCA TCCTCAAAGG
 2501  TTTGTAAGGA AGAATTCTCA GTCCAAAGCC TCAACAAGGT CAGGGTACAG
 2551  AGTCTCCAAA CCATTAGCCA AAAGCTACAG GAGATCAATG AAGAATCTTC
 2601  AATCAAAGTA AACTACTGTT CCAGCACATG CATCATGGTC AGTAAGTTTC
 2651  AGAAAAAGAC ATCCACCGAA GACTTAAAGT TAGTGGGCAT CTTTGAAAGT
 2701  AATCTTGTCA ACATCGAGCA GCTGGCTTGT GGGGACCAGA CAAAAAAGGA
 2751  ATGGTGCAGA ATTGTTAGGC GCACCTACCA AAAGCATCTT TGCCTTTATT
 2801  GCAAAGATAA AGCAGATTCC TCTAGTACAA GTGGGGAACA AAATAACGTG
 2851  GAAAAGAGCT GTCCTGACAG CCCACTCACT AATGCGTATG ACGAACGCAG
 2901  TGACGACCAC AAAAGAATTC CCTCTATATA AGAAGGCATT CATTCCCATT
 2951  TGAAGGATCA TCAGATACTG AACCAATCCT TCTAGAAGAT CTCCACAATG
 3001  GCTTCCTCTA TGCTCTCTTC CGCTACTATG GTTGCCTCTC CGGCTCAGGC
 3051  CACTATGGTC GCTCCTTTCA ACGGACTTAA GTCCTCCGCT GCCTTCCCAG
 3101  CCACCCGCAA GGCTAACAAC GACATTACTT CCATCACAAG CAACGGCGGA
 3151  AGAGTTAACT GCATGCAGGT GTGGCCTCCG ATTGGAAAGA AGAAGTTTGA
 3201  GACTCTCTCT TACCTTCCTG ACCTTACCGA TTCCGGTGGT CGCGTCAACT
 3251  GCATGCAGGC CATGGCTAAG ATTTTTGATT TCGTAAAACC TGGCGTAATC
 3301  ACTGGTGATG ACGTACAGAA AGTTTTCCAG GTAGCAAAAG AAAACAACTT
 3351  CGCACTGCCA GCAGTAAACT GCGTCGGTAC TGACTCCATC AACGCCGTAC
 3401  TGGAAACCGC TGCTAAAGTT AAAGCGCCGG TTATCGTTCA GTTCTCCAAC
 3451  GGTGGTGCTT CCTTTATCGC TGGTAAAGGC GTGAAATCTG ACGTTCCGCA
 3501  GGGTGCTGCT ATCCTGGGCG CGATCTCTGG TGCGCATCAC GTTCACCAGA
 3551  TGGCTGAACA TTATGGTGTT CCGGTTATCC TGCACACTGA CCACTGCGCG
 3601  AAGAAACTGC TGCCGTGGAT CGACGGTCTG TTGGACGCGG GTGAAAAACA
 3651  CTTCGCAGCT ACCGGTAAGC CGCTGTTCTC TTCTCACATG ATCGACCTGT
 3701  CTGAAGAATC TCTGCAAGAG AACATCGAAA TCTGCTCTAA ATACCTGGAG
 3751  CGCATGTCCA AAATCGGCAT GACTCTGGAA ATCGAACTGG GTTGCACCGG
 3801  TGGTGAAGAA GACGGCGTGG ACAACAGCCA CATGGACGCT TCTGCACTGT
 3851  ACACCCAGCC GGAAGACGTT GATTACGCAT ACACCGAACT GAGCAAAATC
 3901  AGCCCGCGTT TCACCATCGC AGCGTCCTTC GGTAACGTAC ACGGTGTTTA
 3951  CAAGCCGGGT AACGTGGTTC TGACTCCGAC CATCCTGCGT GATTCTCAGG
 4001  AATATGTTTC CAAGAAACAC AACCTGCCGC ACAACAGCCT GAACTTCGTA
 4051  TTCCACGGTG GTTCCGGTTC TACTGCTCAG GAAATCAAAG ACTCCGTAAG
 4101  CTACGGCGTA GTAAAAATGA ACATCGATAC CGATACCCAA TGGGCAACCT
 4151  GGGAAGGCGT TCTGAACTAC TACAAAGCGA ACGAAGCTTA TCTGCAGGGT
 4201  CAGCTGGGTA ACCCGAAAGG CGAAGATCAG CCGAACAAGA AATACTACGA
 4251  TCCGCGCGTA TGGCTGCGTG CCGGTCAGAC TTCGATGATC GCTCGTCTGG
 4301  AGAAAGCATT CCAGGAACTG AACGCGATCG ACGTTCTGTA AGAGCTCGGT
 4351  ACCGGATCCA ATTCCCGATC GTTCAAACAT TTGGCAATAA AGTTTCTTAA
 4401  GATTGAATCC TGTTGCCGGT CTTGCGATGA TTATCATATA ATTTCTGTTG
 4451  AATTACGTTA AGCATGTAAT AATTAACATG TAATGCATGA CGTTATTTAT
 4501  GAGATGGGTT TTTATGATTA GAGTCCCGCA ATTATACATT TAATACGCGA
 4551  TAGAAAACAA AATATAGCGC GCAAACTAGG ATAAATTATC GCGCGCGGTG
 4601  TCATCTATGT TACTAGATCG GGGATCGATC CCCGGGCGGC CGCCACTCGA
 4651  GTGGTGGCCG CATCGATCGT GAAGTTTCTC ATCTAAGCCC CCATTTGGAC
 4701  GTGAATGTAG ACACGTCGAA ATAAAGATTT CCGAATTAGA ATAATTTGTT
 4751  TATTGCTTTC GCCTATAAAT ACGACGGATC GTAATTTGTC GTTTTATCAA
 4801  AATGTACTTT CATTTTATAA TAACGCTGCG GACATCTACA TTTTTGAATT
 4851  GAAAAAAAAT TGGTAATTAC TCTTTCTTTT TCTCCATATT GACCATCATA
 4901  CTCATTGCTG ATCCATGTAG ATTTCCCGGA CATGAAGCCA TTTACAATTG
 4951  AATATATCCT GCCGCCGCTG CCGCTTTGCA CCCGGTGGAG CTTGCATGTT
 5001  GGTTTCTACG CAGAACTGAG CCGGTTAGGC AGATAATTTC CATTGAGAAC
 5051  TGAGCCATGT GCACCTTCCC CCCAACACGG TGAGCGACGG GGCAACGGAG
 5101  TGATCCACAT GGGACTTTTC CTAGCTTGGC TGCCATTTTT GGGGTGAGGC
 5151  CGTTCGCGCG GGGCGCCAGC TGGGGGGATG GGAGGCCCGC GTTACCGGGA
 5201  GGGTTCGAGA AGGGGGGGCA CCCCCCTTCG GCGTGCGCGG TCACGCGCCA
 5251  GGGCGCAGCC CTGGTTAAAA ACAAGGTTTA TAAATATTGG TTTAAAAGCA
 5301  GGTTAAAAGA CAGGTTAGCG GTGGCCGAAA AACGGGCGGA AACCCTTGCA
 5351  AATGCTGGAT TTTCTGCCTG TGGACAGCCC CTCAAATGTC AATAGGTGCG
 5401  CCCCTCATCT GTCATCACTC TGCCCCTCAA GTGTCAAGGA TCGCGCCCCT
 5451  CATCTGTCAG TAGTCGCGCC CCTCAAGTGT CAATACCGCA GGGCACTTAT
 5501  CCCCAGGCTT GTCCACATCA TCTGTGGGAA ACTCGCGTAA AATCAGGCGT
 5551  TTTCGCCGAT TTGCGAGGCT GGCCAGCTCC ACGTCGCCGG CCGAAATCGA
 5601  GCCTGCCCCT CATCTGTCAA CGCCGCGCCG GGTGAGTCGG CCCCTCAAGT
 5651  GTCAACGTCC GCCCCTCATC TGTCAGTGAG GGCCAAGTTT TCCGCGTGGT
 5701  ATCCACAACG CCGGCGGCCG GCCGCGGTGT CTCGCACACG GCTTCGACGG
 5751  CGTTTCTGGC GCGTTTGCAG GGCCATAGAC GGCCGCCAGC CCAGCGGCGA
 5801  GGGCAACCAG CCCGGTGAGC GTCGGAAAGG GTCGATCGAC CGATGCCCTT
 5851  GAGAGCCTTC AACCCAGTCA GCTCCTTCCG GTGGGCGCGG GGCATGACTA
 5901  TCGTCGCCGC ACTTATGACT GTCTTCTTTA TCATGCAACT CGTAGGACAG
 5951  GTGCCGGCAG CGCTCTGGGT CATTTTCGGC GAGGACCGCT TTCGCTGGAG
 6001  CGCGACGATG ATCGGCCTGT CGCTTGCGGT ATTCGGAATC TTGCACGCCC
 6051  TCGCTCAAGC CTTCGTCACT GGTCCCGCCA CCAAACGTTT CGGCGAGAAG
 6101  CAGGCCATTA TCGCCGGCAT GGCGGCCGAC GCGCTGGGCT ACGTCTTGCT
 6151  GGCGTTCGCG ACGCGAGGCT GGATGGCCTT CCCCATTATG ATTCTTCTCG
 6201  CTTCCGGCGG CATCGGGATG CCCGCGTTGC AGGCCATGCT GTCCAGGCAG
 6251  GTAGATGACG ACCATCAGGG ACAGCTTCAA GGATCGCTCG CGGCTCTTAC
 6301  CAGCCTAACT TCGATCACTG GACCGCTGAT CGTCACGGCG ATTTATGCCG
 6351  CCTCGGCGAG CACATGGAAC GGGTTGGCAT GGATTGTAGG CGCCGCCCTA
 6401  TACCTTGTCT GCCTCCCCGC GTTGCGTCGC GGTGCATGGA GCCGGGCCAC
 6451  CTCGACCTGA ATGGAAGCCG GCGGCACCTC GCTAACGGAT TCACCACTCC
 6501  AAGAATTGGA GCCAATCAAT TCTTGCGGAG AACTGTGAAT GCGCAAACCA
 6551  ACCCTTGGCA GAACATATCC ATCGCGTCCG CCATCTCCAG CAGCCGCACG
 6601  CGGCGCATCT CGGGCAGCGT TGGGTCCTGG CCACGGGTGC GCATGATCGT
 6651  GCTCCTGTCG TTGAGGACCC GGCTAGGCTG GCGGGGTTGC CTTACTGGTT
 6701  AGCAGAATGA ATCACCGATA CGCGAGCGAA CGTGAAGCGA CTGCTGCTGC
 6751  AAAACGTCTG CGACCTGAGC AACAACATGA ATGGTCTTCG GTTTCCGTGT
 6801  TTCGTAAAGT CTGGAAACGC GGAAGTCAGC GCCCTGCACC ATTATGTTCC
 6851  GGATCTGCAT CGCAGGATGC TGCTGGCTAC CCTGTGGAAC ACCTACATCT
 6901  GTATTAACGA AGCGCTGGCA TTGACCCTGA GTGATTTTTC TCTGGTCCCG
 6951  CCGCATCCAT ACCGCCAGTT GTTTACCCTC ACAACGTTCC AGTAACCGGG
 7001  CATGTTCATC ATCAGTAACC CGTATCGTGA GCATCCTCTC TCGTTTCATC
 7051  GGTATCATTA CCCCCATGAA CAGAAATTCC CCCTTACACG GAGGCATCAA
 7101  GTGACCAAAC AGGAAAAAAC CGCCCTTAAC ATGGCCCGCT TTATCAGAAG
 7151  CCAGACATTA ACGCTTCTGG AGAAACTCAA CGAGCTGGAC GCGGATGAAC
 7201  AGGCAGACAT CTGTGAATCG CTTCACGACC ACGCTGATGA GCTTTACCGC
 7251  AGCTGCCTCG CGCGTTTCGG TGATGACGGT GAAAACCTCT GACACATGCA
 7301  GCTCCCGGAG ACGGTCACAG CTTGTCTGTA AGCGGATGCC GGGAGCAGAC
 7351  AAGCCCGTCA GGGCGCGTCA GCGGGTGTTG GCGGGTGTCG GGGCGCAGCC
 7401  ATGACCCAGT CACGTAGCGA TAGCGGAGTG TATACTGGCT TAACTATGCG
 7451  GCATCAGAGC AGATTGTACT GAGAGTGCAC CATATGCGGT GTGAAATACC
 7501  GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCTCT TCCGCTTCCT
 7551  CGCTCACTGA CTCGCTGCGC TCGGTCGTTC GGCTGCGGCG AGCGGTATCA
 7601  GCTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG GGGATAACGC
 7651  AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AACCGTAAAA
 7701  AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT
 7751  CACAAAAATC GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA
 7801  AAGATACCAG GCGTTTCCCC CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC
 7851  CGACCCTGCC GCTTACCGGA TACCTGTCCG CCTTTCTCCC TTCGGGAAGC
 7901  GTGGCGCTTT CTCATAGCTC ACGCTGTAGG TATCTCAGTT CGGTGTAGGT
 7951  CGTTCGCTCC AAGCTGGGCT GTGTGCACGA ACCCCCCGTT CAGCCCGACC
 8001  GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC
 8051  GACTTATCGC CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG
 8101  GTATGTAGGC GGTGCTACAG AGTTCTTGAA GTGGTGGCCT AACTACGGCT
 8151  ACACTAGAAG GACAGTATTT GGTATCTGCG CTCTGCTGAA GCCAGTTACC
 8201  TTCGGAAAAA GAGTTGGTAG CTCTTGATCC GGCAAACAAA CCACCGCTGG
 8251  TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA GATTACGCGC AGAAAAAAAG
 8301  GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG
 8351  AACGAAAACT CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT
 8401  CTTCACCTAG ATCCTTTTAA ATTAAAAATG AAGTTTTAAA TCAATCTAAA
 8451  GTATATATGA GTAAACTTGG TCTGACAGTT ACCAATGCTT AATCAGTGAG
 8501  GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG TTGCCTGACT
 8551  CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA
 8601  GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA
 8651  GCAATAAACC AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC
 8701  TTTATCCGCC TCCATCCAGT CTATTAATTG TTGCCGGGAA GCTAGAGTAA
 8751  GTAGTTCGCC AGTTAATAGT TTGCGCAACG TTGTTGCCAT TGCTGCAGGT
 8801  CGGGAGCACA GGATGACGCC TAACAATTCA TTCAAGCCGA CACCGCTTCG
 8851  CGGCGCGGCT TAATTCAGGA GTTAAACATC ATGAGGGAAG CGGTGATCGC
 8901  CGAAGTATCG ACTCAACTAT CAGAGGTAGT TGGCGTCATC GAGCGCCATC
 8951  TCGAACCGAC GTTGCTGGCC GTACATTTGT ACGGCTCCGC AGTGGATGGC
 9001  GGCCTGAAGC CACACAGTGA TATTGATTTG CTGGTTACGG TGACCGTAAG
 9051  GCTTGATGAA ACAACGCGGC GAGCTTTGAT CAACGACCTT TTGGAAACTT
 9101  CGGCTTCCCC TGGAGAGAGC GAGATTCTCC GCGCTGTAGA AGTCACCATT
 9151  GTTGTGCACG ACGACATCAT TCCGTGGCGT TATCCAGCTA AGCGCGAACT
 9201  GCAATTTGGA GAATGGCAGC GCAATGACAT TCTTGCAGGT ATCTTCGAGC
 9251  CAGCCACGAT CGACATTGAT CTGGCTATCT TGCTGACAAA AGCAAGAGAA
 9301  CATAGCGTTG CCTTGGTAGG TCCAGCGGCG GAGGAACTCT TTGATCCGGT
 9351  TCCTGAACAG GATCTATTTG AGGCGCTAAA TGAAACCTTA ACGCTATGGA
 9401  ACTCGCCGCC CGACTGGGCT GGCGATGAGC GAAATGTAGT GCTTACGTTG
 9451  TCCCGCATTT GGTACAGCGC AGTAACCGGC AAAATCGCGC CGAAGGATGT
 9501  CGCTGCCGAC TGGGCAATGG AGCGCCTGCC GGCCCAGTAT CAGCCCGTCA
 9551  TACTTGAAGC TAGGCAGGCT TATCTTGGAC AAGAAGATCG CTTGGCCTCG
 9601  CGCGCAGATC AGTTGGAAGA ATTTGTTCAC TACGTGAAAG GCGAGATCAC
 9651  CAAGGTAGTC GGCAAATAAT GTCTAACAAT TCGTTCAAGC CGACGCCGCT
 9701  TCGCGGCGCG GCTTAACTCA AGCGTTAGAT GCTGCAGGCA TCGTGGTGTC
 9751  ACGCTCGTCG TTTGGTATGG CTTCATTCAG CTCCGGTTCC CAACGATCAA
 9801  GGCGAGTTAC ATGATCCCCC ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC
 9851  GGTCCTCCGA TCGAGGATTT TTCGGCGCTG CGCTACGTCC GCKACCGCGT
 9901  TGAGGGATCA AGCCACAGCA GCCCACTCGA CCTCTAGCCG ACCCAGACGA
 9951  GCCAAGGGAT CTTTTTGGAA TGCTGCTCCG TCGTCAGGCT TTCCGACGTT
10001  TGGGTGGTTG AACAGAAGTC ATTATCGTAC GGAATGCCAA GCACTCCCGA
10051  GGGGAACCCT GTGGTTGGCA TGCACATACA AATGGACGAA CGGATAAACC
10101  TTTTCACGCC CTTTTAAATA TCCGTTATTC TAATAAACGC TCTTTTCTCT
10151  TAGGTTTACC CGCCAATATA TCCTGTCAAA CACTGATAGT TTAAACTGAA
10201  GGCGGGAAAC GACAATCTGA TCCCCATCAA GCTTGAGCTC AGGATTTAGC
10251  AGCATTCCAG ATTGGGTTCA ATCAACAAGG TACGAGCCAT ATCACTTTAT
10301  TCAAATTGGT ATCGCCAAAA CCAAGAAGGA ACTCCCATCC TCAAAGGTTT
10351  GTAAGGAAGA ATTCTCAGTC CAAAGCCTCA ACAAGGTCAG GGTACAGAGT
10401  CTCCAAACCA TTAGCCAAAA GCTACAGGAG ATCAATGAAG AATCTTCAAT
10451  CAAAGTAAAC TACTGTTCCA GCACATGCAT CATGGTCAGT AAGTTTCAGA
10501  AAAAGACATC CACCGAAGAC TTAAAGTTAG TGGGCATCTT TGAAAGTAAT
10551  CTTGTCAACA TCGAGCAGCT GGCTTGTGGG GACCAGACAA AAAAGGAATG
10601  GTGCAGAATT GTTAGGCGCA CCTACCAAAA GCATCTTTGC CTTTATTGCA
10651  AAGATAAAGC AGATTCCTCT AGTACAAGTG GGGAACAAAA TAACGTGGAA
10701  AAGAGCTGTC CTGACAGCCC ACTCACTAAT GCGTATGACG AACGCAGTGA
10751  CGACCACAAA AGAATTCCCT CTATATAAGA AGGCATTCAT TCCCATTTGA
10801  AGGATCATCA GATACTGAAC CAATCCTTCT AGAAGATCTA AGCTTAT



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10901 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

    1  CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC
   51  TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT
  101  GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT
  151  CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG
  201  AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
  251  TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG
  301  GCGTGCATGC TTCACGGTGC AAGCAGCCGT CCAGCAACTG CTCGTAAGTC
  351  CTCTGGTCTT TCTGGAACCG TCCGTATTCC AGGTGACAAG TCTATCTCCC
  401  ACAGGTCCTT CATGTTTGGA GGTCTCGCTA GCGGTGAAAC TCGTATCACC
  451  GGTCTTTTGG AAGGTGAAGA TGTTATCAAC ACTGGTAAGG CTATGCAAGC
  501  TATGGGTGCC AGAATCCGTA AGGAAGGTGA TACTTGGATC ATTGATGGTG
  551  TTGGTAACGG TGGACTCCTT GCTCCTGAGG CTCCTCTCGA TTTCGGTAAC
  601  GCTGCAACTG GTTGCCGTTT GACTATGGGT CTTGTTGGTG TTTACGATTT
  651  CGATAGCACT TTCATTGGTG ACGCTTCTCT CACTAAGCGT CCAATGGGTC
  701  GTGTGTTGAA CCCACTTCGC GAAATGGGTG TGCAGGTGAA GTCTGAAGAC
  751  GGTGATCGTC TTCCAGTTAC CTTGCGTGGA CCAAAGACTC CAACGCCAAT
  801  CACCTACAGG GTACCTATGG CTTCCGCTCA AGTGAAGTCC GCTGTTCTGC
  851  TTGCTGGTCT CAACACCCCA GGTATCACCA CTGTTATCGA GCCAATCATG
  901  ACTCGTGACC ACACTGAAAA GATGCTTCAA GGTTTTGGTG CTAACCTTAC
  951  CGTTGAGACT GATGCTGACG GTGTGCGTAC CATCCGTCTT GAAGGTCGTG
 1001  GTAAGCTCAC CGGTCAAGTG ATTGATGTTC CAGGTGATCC ATCCTCTACT
 1051  GCTTTCCCAT TGGTTGCTGC CTTGCTTGTT CCAGGTTCCG ACGTCACCAT
 1101  CCTTAACGTT TTGATGAACC CAACCCGTAC TGGTCTCATC TTGACTCTGC
 1151  AGGAAATGGG TGCCGACATC GAAGTGATCA ACCCACGTCT TGCTGGTGGA
 1201  GAAGACGTGG CTGACTTGCG TGTTCGTTCT TCTACTTTGA AGGGTGTTAC
 1251  TGTTCCAGAA GACCGTGCTC CTTCTATGAT CGACGAGTAT CCAATTCTCG
 1301  CTGTTGCAGC TGCATTCGCT GAAGGTGCTA CCGTTATGAA CGGTTTGGAA
 1351  GAACTCCGTG TTAAGGAAAG CGACCGTCTT TCTGCTGTCG CAAACGGTCT
 1401  CAAGCTCAAC GGTGTTGATT GCGATGAAGG TGAGACTTCT CTCGTCGTGC
 1451  GTGGTCGTCC TGACGGTAAG GGTCTCGGTA ACGCTTCTGG AGCAGCTGTC
 1501  GCTACCCACC TCGATCACCG TATCGCTATG AGCTTCCTCG TTATGGGTCT
 1551  CGTTTCTGAA AACCCTGTTA CTGTTGATGA TGCTACTATG ATCGCTACTA
 1601  GCTTCCCAGA GTTCATGGAT TTGATGGCTG GTCTTGGAGC TAAGATCGAA
 1651  CTCTCCGACA CTAAGGCTGC TTGATGAGCT CAAGAATTCG AGCTCGGTAC
 1701  CGGATCCAGC TTTCGTTCGT ATCATCGGTT TCGACAACGT TCGTCAAGTT
 1751  CAATGCATCA GTTTCATTGC GCACACACCA GAATCCTACT GAGTTCGAGT
 1801  ATTATGGCAT TGGGAAAACT GTTTTTCTTG TACCATTTGT TGTGCTTGTA
 1851  ATTTACTGTG TTTTTTATTC GGTTTTCGCT ATCGAACTGT GAAATGGAAA
 1901  TGGATGGAGA AGAGTTAATG AATGATATGG TCCTTTTGTT CATTCTCAAA
 1951  TTAATATTAT TTGTTTTTTC TCTTATTTGT TGTGTGTTGA ATTTGAAATT
 2001  ATAAGAGATA TGCAAACATT TTGTTTTGAG TAAAAATGTG TCAAATCGTG
 2051  GCCTCTAATG ACCGAAGTTA ATATGAGGAG TAAAACACTT GTAGTTGTAC
 2101  CATTATGCTT ATTCACTAGG CAACAAATAT ATTTTCAGAC CTAGAAAAGC
 2151  TGCAAATGTT ACTGAATACA AGTATGTCCT CTTGTGTTTT AGACATTTAT
 2201  GAACTTTCCT TTATGTAATT TTCCAGAATC CTTGTCAGAT TCTAATCATT
 2251  GCTTTATAAT TATAGTTATA CTCATGGATT TGTAGTTGAG TATGAAAATA
 2301  TTTTTTAATG CATTTTATGA CTTGCCAATT GATTGACAAC ATGCATCAAT
 2351  CGACCTGCAG CCACTCGAAG CGGCCGCGTT CAAGCTTGAG CTCAGGATTT
 2401  AGCAGCATTC CAGATTGGGT TCAATCAACA AGGTACGAGC CATATCACTT
 2451  TATTCAAATT GGTATCGCCA AAACCAAGAA GGAACTCCCA TCCTCAAAGG
 2501  TTTGTAAGGA AGAATTCTCA GTCCAAAGCC TCAACAAGGT CAGGGTACAG
 2551  AGTCTCCAAA CCATTAGCCA AAAGCTACAG GAGATCAATG AAGAATCTTC
 2601  AATCAAAGTA AACTACTGTT CCAGCACATG CATCATGGTC AGTAAGTTTC
 2651  AGAAAAAGAC ATCCACCGAA GACTTAAAGT TAGTGGGCAT CTTTGAAAGT
 2701  AATCTTGTCA ACATCGAGCA GCTGGCTTGT GGGGACCAGA CAAAAAAGGA
 2751  ATGGTGCAGA ATTGTTAGGC GCACCTACCA AAAGCATCTT TGCCTTTATT
 2801  GCAAAGATAA AGCAGATTCC TCTAGTACAA GTGGGGAACA AAATAACGTG
 2851  GAAAAGAGCT GTCCTGACAG CCCACTCACT AATGCGTATG ACGAACGCAG
 2901  TGACGACCAC AAAAGAATTC CCTCTATATA AGAAGGCATT CATTCCCATT
 2951  TGAAGGATCA TCAGATACTG AACCAATCCT TCTAGAAGAT CTAAGCTTAT
 3001  CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC
 3051  TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT
 3101  GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT
 3151  CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG
 3201  AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
 3251  TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG
 3301  GCGTGCATGC AGGCcatggC TAAGATTTTT GATTTCGTAA AACCTGGCGT
 3351  AATCACTGGT GATGACGTAC AGAAAGTTTT CCAGGTAGCA AAAGAAAACA
 3401  ACTTCGCACT GCCAGCAGTA AACTGCGTCG GTACTGACTC CATCAACGCC
 3451  GTACTGGAAA CCGCTGCTAA AGTTAAAGCG CCGGTTATCG TTCAGTTCTC
 3501  CAACGGTGGT GCTTCCTTTA TCGCTGGTAA AGGCGTGAAA TCTGACGTTC
 3551  CGCAGGGTGC TGCTATCCTG GGCGCGATCT CTGGTGCGCA TCACGTTCAC
 3601  CAGATGGCTG AACATTATGG TGTTCCGGTT ATCCTGCACA CTGACCACTG
 3651  CGCGAAGAAA CTGCTGCCGT GGATCGACGG TCTGTTGGAC GCGGGTGAAA
 3701  AACACTTCGC AGCTACCGGT AAGCCGCTGT TCTCTTCTCA CATGATCGAC
 3751  CTGTCTGAAG AATCTCTGCA AGAGAACATC GAAATCTGCT CTAAATACCT
 3801  GGAGCGCATG TCCAAAATCG GCATGACTCT GGAAATCGAA CTGGGTTGCA
 3851  CCGGTGGTGA AGAAGACGGC GTGGACAACA GCCACATGGA CGCTTCTGCA
 3901  CTGTACACCC AGCCGGAAGA CGTTGATTAC GCATACACCG AACTGAGCAA
 3951  AATCAGCCCG CGTTTCACCA TCGCAGCGTC CTTCGGTAAC GTACACGGTG
 4001  TTTACAAGCC GGGTAACGTG GTTCTGACTC CGACCATCCT GCGTGATTCT
 4051  CAGGAATATG TTTCCAAGAA ACACAACCTG CCGCACAACA GCCTGAACTT
 4101  CGTATTCCAC GGTGGTTCCG GTTCTACTGC TCAGGAAATC AAAGACTCCG
 4151  TAAGCTACGG CGTAGTAAAA ATGAACATCG ATACCGATAC CCAATGGGCA
 4201  ACCTGGGAAG GCGTTCTGAA CTACTACAAA GCGAACGAAG CTTATCTGCA
 4251  GGGTCAGCTG GGTAACCCGA AAGGCGAAGA TCAGCCGAAC AAGAAATACT
 4301  ACGATCCGCG CGTATGGCTG CGTGCCGGTC AGACTTCGAT GATCGCTCGT
 4351  CTGGAGAAAG CATTCCAGGA ACTGAACGCG ATCGACGTTC TGTAAGAGCT
 4401  CGGTACCGGA TCCAATTccc GATCGTTCAA ACATTTGGCA ATAAAGTTTC
 4451  TTAAGATTGA ATCCTGTTGC CGGTCTTGCG ATGATTATCA TATAATTTCT
 4501  GTTGAATTAC GTTAAGCATG TAATAATTAA CATGTAATGC ATGACGTTAT
 4551  TTATGAGATG GGTTTTTATG ATTAGAGTCC CGCAATTATA CATTTAATAC
 4601  GCGATAGAAA ACAAAATATA GCGCGCAAAC TAGGATAAAT TATCGCGCGC
 4651  GGTGTCATCT ATGTTACTAG ATCGGGGATC GATCCCCGGG CGGCCGCCAC
 4701  TCGAGTGGTG GCCGCATCGA TCGTGAAGTT TCTCATCTAA GCCCCCATTT
 4751  GGACGTGAAT GTAGACACGT CGAAATAAAG ATTTCCGAAT TAGAATAATT
 4801  TGTTTATTGC TTTCGCCTAT AAATACGACG GATCGTAATT TGTCGTTTTA
 4851  TCAAAATGTA CTTTCATTTT ATAATAACGC TGCGGACATC TACATTTTTG
 4901  AATTGAAAAA AAATTGGTAA TTACTCTTTC TTTTTCTCCA TATTGACCAT
 4951  CATACTCATT GCTGATCCAT GTAGATTTCC CGGACATGAA GCCATTTACA
 5001  ATTGAATATA TCCTGCCGCC GCTGCCGCTT TGCACCCGGT GGAGCTTGCA
 5051  TGTTGGTTTC TACGCAGAAC TGAGCCGGTT AGGCAGATAA TTTCCATTGA
 5101  GAACTGAGCC ATGTGCACCT TCCCCCCAAC ACGGTGAGCG ACGGGGCAAC
 5151  GGAGTGATCC ACATGGGACT TTTCCTAGCT TGGCTGCCAT TTTTGGGGTG
 5201  AGGCCGTTCG CGCGGGGCGC CAGCTGGGGG GATGGGAGGC CCGCGTTACC
 5251  GGGAGGGTTC GAGAAGGGGG GGCACCCCCC TTCGGCGTGC GCGGTCACGC
 5301  GCCAGGGCGC AGCCCTGGTT AAAAACAAGG TTTATAAATA TTGGTTTAAA
 5351  AGCAGGTTAA AAGACAGGTT AGCGGTGGCC GAAAAACGGG CGGAAACCCT
 5401  TGCAAATGCT GGATTTTCTG CCTGTGGACA GCCCCTCAAA TGTCAATAGG
 5451  TGCGCCCCTC ATCTGTCATC ACTCTGCCCC TCAAGTGTCA AGGATCGCGC
 5501  CCCTCATCTG TCAGTAGTCG CGCCCCTCAA GTGTCAATAC CGCAGGGCAC
 5551  TTATCCCCAG GCTTGTCCAC ATCATCTGTG GGAAACTCGC GTAAAATCAG
 5601  GCGTTTTCGC CGATTTGCGA GGCTGGCCAG CTCCACGTCG CCGGCCGAAA
5651
5701 
5751 
5801 
5851 
5901 
5951 
6001 
6051 
6101 
6151 
6201 
6251 
6301 
6351 
6401 
6451 
6501 
6551 
6601 
6651 
6701 
6751 
6801 
6851 
6901 
6951 
7001 
7051 
7101 
7151 
7201 
7251 
7301 
7351 
7401 
7451 
7501 
7551 
7601 
7651 
7701 
7751 
7801 
7851 
7901 
7951 
8001 
8051
8101 
8151 
8201 
8251 
8301 
8351 
8401 
8451 
8501 



TCGAGCCTGC 
AAGTGTCAAC 
TGGTATCCAC 
ACGGCGTTTC 
GCGAGGGCAA 
CCTTGAGAGC 
ACTATCGTCG 
ACAGGTGCCG 
GGAGCGCGAC 
GCCCTCGCTC 
GAAGCAGGCC 
TGCTGGCGTT 
CTCGCTTCCG 
GCAGGTAGAT 
TTACCAGCCT 
GCCGCCTCGG 
CCTATACCTT 
CCACCTCGAC 
CTCCAAGAAT 
ACCAACCCTT 
CACGCGGCGC 
TCGTGCTCCT 
GGTTAGCAGA 
CTGCAAAACG 
GTGTTTCGTA 
TTCCGGATCT 
ATCTGTATTA 
CCCGCCGCAT 
CGGGCATGTT 
CATCGGTATC 
TCAAGTGACC 
GAAGCCAGAC 
GAACAGGCAG 
CCGCAGCTGC 
TGCAGCTCCC 
AGACAAGCCC 
AGCCATGACC 
TGCGGCATCA 
TACCGCACAG 
TCCTCGCTCA 
ATCAGCTCAC 
ACGCAGGAAA 
AAAAAGGCCG 
GCATCACAAA 
TATAAAGATA 
GTTCCGACCC 
AAGCGTGGCG 
AGGTCGTTCG 
GACCGCTGCG 
ACACGACTTA 
CGAGGTATGT 
GGCTACACTA 
TACCTTCGGA 
CTGGTAGCGG 
AAAGGATCTC 
GTGGAACGAA 
GGATCTTCAC 
TAAAGTATAT 



CCCTCATCTG 
GTCCGCCCCT 
AACGCCGGCG 
TGGCGCGTTT 
CCAGCCCGGT 
CTTCAACCCA 
CCGCACTTAT 
GCAGCGCTCT 
GATGATCGGC 
AAGCCTTCGT 
ATTATCGCCG 
CGCGACGCGA 
GCGGCATCGG 
GACGACCATC 
AACTTCGATC 
CGAGCACATG 
GTCTGCCTCC 
CTGAATGGAA 
TGGAGCCAAT 
GGCAGAACAT 
ATCTCGGGCA 
GTCGTTGAGG 
ATGAATCACC 
TCTGCGACCT 
AAGTCTGGAA 
GCATCGCAGG 
ACGAAGCGCT 
CCATACCGCC 
CATCATCAGT 
ATTACCCCCA 
AAACAGGAAA 
ATTAACGCTT 
ACATCTGTGA 
CTCGCGCGTT 
GGAGACGGTC 
GTCAGGGCGC 
CAGTCACGTA 
GAGCAGATTG 
ATGCGTAAGG 
CTGACTCGCT 
TCAAAGGCGG 
GAACATGTGA 
CGTTGCTGGC 
AATCGACGCT 
CCAGGCGTTT 
TGCCGCTTAC 
CTTTCTCATA 
CTCCAAGCTG 
CCTTATCCGG 

TCGCCACTGG
AGGCGGTGCT
GAAGGACAGT
AAAAGAGTTG

TGGTTTTTTT 
AAGAAGATCC 
AACTCACGTT 
CTAGATCCTT 
ATGAGTAAAC 



TCAACGCCGC 
CATCTGTCAG 
GCCGGCCGCG 
GCAGGGCCAT 
GAGCGTCGGA 
GTCAGCTCCT 
GACTGTCTTC 
GGGTCATTTT 
CTGTCGCTTG 
CACTGGTCCC 
GCATGGCGGC 
GGCTGGATGG 
GATGCCCGCG 
AGGGACAGCT 
ACTGGACCGC 
GAACGGGTTG 
CCGCGTTGCG 
GCCGGCGGCA 
CAATTCTTGC 
ATCCATCGCG 
GCGTTGGGTC 
ACCCGGCTAG 
GATACGCGAG 
GAGCAACAAC 
ACGCGGAAGT 
ATGCTGCTGG 
GGCATTGACC 
AGTTGTTTAC 
AACCCGTATC 
TGAACAGAAA 
AAACCGCCCT 
CTGGAGAAAC 
ATCGCTTCAC 
TCGGTGATGA 
ACAGCTTGTC 
GTCAGCGGGT 
GCGATAGCGG 
TACTGAGAGT 
AGAAAATACC 
GCGCTCGGTC 
TAATACGGTT 
GCAAAAGGCC 
GTTTTTCCAT 
CAAGTCAGAG 
CCCCCTGGAA 
CGGATACCTG 
GCTCACGCTG 
GGCTGTGTGC 
TAACTATCGT 
CAGCAGCCAC 
ACAGAGTTCT 
ATTTGGTATC 
GTAGCTCTTG 
GTTTGCAAGC 
TTTGATCTTT 
AAGGGATTTT 
TTAAATTAAA 
TTGGTCTGAC 



GCCGGGTGAG 
TGAGGGCCAA 
GTGTCTCGCA 
AGACGGCCGC 
AAGGGTCGAT 
TCCGGTGGGC 
TTTATCATGC 
CGGCGAGGAC 
CGGTATTCGG 
GCCACCAAAC 
CGACGCGCTG 
CCTTCCCCAT 
TTGCAGGCCA 
TCAAGGATCG 
TGATCGTCAC 
GCATGGATTG 
TCGCGGTGCA 
CCTCGCTAAC 
GGAGAACTGT 
TCCGCCATCT 
CTGGCCACGG 
GCTGGCGGGG 
CGAACGTGAA 
ATGAATGGTC 
CAGCGCCCTG 
CTACCCTGTG 
CTGAGTGATT 
CCTCACAACG 
GTGAGCATCC 
TTCCCCCTTA 
TAACATGGCC 
TCAACGAGCT 
GACCACGCTG 
CGGTGAAAAC 
TGTAAGCGGA 
GTTGGCGGGT 
AGTGTATACT 
GCACCATATG 
GCATCAGGCG 
GTTCGGCTGC 
ATCCACAGAA 
AGCAAAAGGC 
AGGCTCCGCC 
GTGGCGAAAC 
GCTCCCTCGT 
TCCGCCTTTC 
TAGGTATCTC 
ACGAACCCCC 
CTTGAGTCCA 
TGGTAACAGG 
TGAAGTGGTG 
TGCGCTCTGC 
ATCCGGCAAA 
AGCAGATTAC 
TCTACGGGGT 
GGTCATGAGA 
AATGAAGTTT 
AGTTACCAAT 



TCGGCCCCTC 
GTTTTCCGCG 
CACGGCTTCG 
CAGCCCAGCG 
CGACCGATGC 
GCGGGGCATG 
AACTCGTAGG 
CGCTTTCGCT 
AATCTTGCAC 
GTTTCGGCGA 
GGCTACGTCT 
TATGATTCTT 
TGCTGTCCAG 
CTCGCGGCTC 
GGCGATTTAT 
TAGGCGCCGC 
TGGAGCCGGG 
GGATTCACCA 
GAATGCGCAA 
CCAGCAGCCG 
GTGCGCATGA 
TTGCCTTACT 
GCGACTGCTG 
TTCGGTTTCC 
CACCATTATG 
GAACACCTAC 
TTTCTCTGGT 
TTCCAGTAAC 
TCTCTCGTTT 
CACGGAGGCA 
CGCTTTATCA 
GGACGCGGAT 
ATGAGCTTTA 
CTCTGACACA 
TGCCGGGAGC 
GTCGGGGCGC 
GGCTTAACTA 
CGGTGTGAAA 
CTCTTCCGCT 
GGCGAGCGGT 
TCAGGGGATA 
CAGGAACCGT 
CCCCTGACGA 
CCGACAGGAC 
GCGCTCTCCT 
TCCCTTCGGG 
AGTTCGGTGT 
CGTTCAGCCC 
ACCCGGTAAG 
ATTAGCAGAG 
GCCTAACTAC 
TGAAGCCAGT 
CAAACCACCG 
GCGCAGAAAA 
CTGACGCTCA 
TTATCAAAAA 
TAAATCAATC 
GCTTAATCAG 
8551
8601 
8651 
8701 
8751 
8801 
8851 
8901 
8951 
9001 
9051 
9101 
9151 
9201 
9251 
9301 
9351 
9401 
9451 
9501 
9551 
9601 
9651 
9701 
9751 
9801 
9851 
9901 
9951 
10001 
10051 
10101 
10151 
10201 
10251 
10301 
10351 
10401 
10451 
10501 
10551 
10601 
10651 
10701 
10751 
10801 
10851 
10901 



TGAGGCACCT 
GACTCCCCGT 
CCCAGTGCTG 
ATCAGCAATA 
CAACTTTATC 
GTAAGTAGTT 
AGGTCGGGAG 
TTCGCGGCGC 
TCGCCGAAGT 
CATCTCGAAC 
TGGCGGCCTG 
TAAGGCTTGA 
ACTTCGGCTT 
CATTGTTGTG 
AACTGCAATT 
GAGCCAGCCA 
AGAACATAGC 
CGGTTCCTGA 
TGGAACTCGC 
GTTGTCCCGC 
ATGTCGCTGC 
GTCATACTTG 
CTCGCGCGCA 
TCACCAAGGT 
CGCTTCGCGG 
TGTCACGCTC 
TCAAGGCGAG 
CTTCGGTCCT 
GCGTTGAGGG 
ACGAGCCAAG 
CGTTTGGGTG 
CCGAGGGGAA 
AACCTTTTCA 
CTCTTAGGTT 
TGAAGGCGGG 
TAGCAGCATT 
TTATTCAAAT 
GTTTGTAAGG 
GAGTCTCCAA 
CAATCAAAGT 
CAGAAAAAGA 
TAATCTTGTC 
AATGGTGCAG 
TGCAAAGATA 
GGAAAAGAGC 
GTGACGACCA 
TTGAAGGATC 



ATCTCAGCGA 
CGTGTAGATA 
CAATGATACC 
AACCAGCCAG 
CGCCTCCATC 
CGCCAGTTAA 
CACAGGATGA 
GGCTTAATTC 
ATCGACTCAA 
CGACGTTGCT 
AAGCCACACA 
TGAAACAACG 
CCCCTGGAGA 
CACGACGACA 
TGGAGAATGG 
CGATCGACAT 
GTTGCCTTGG 
ACAGGATCTA 
CGCCCGACTG 
ATTTGGTACA 
CGACTGGGCA 
AAGCTAGGCA 
GATCAGTTGG 
AGTCGGCAAA 
CGCGGCTTAA 
GTCGTTTGGT 
TTACATGATC 
CCGATCGAGG 
ATCAAGCCAC 
GGATCTTTTT 
GTTGAACAGA 
CCCTGTGGTT 
CGCCCTTTTA 
TACCCGCCAA 
AAACGACAAT 
CCAGATTGGG 
TGGTATCGCC 
AAGAATTCTC 
ACCATTAGCC 
AAACTACTGT 
CATCCACCGA 
AACATCGAGC 
AATTGTTAGG 
AAGCAGATTC 
TGTCCTGACA 
CAAAAGAATT 
ATCAGATACT 



TCTGTCTATT 
ACTACGATAC 
GCGAGACCCA 
CCGGAAGGGC 
CAGTCTATTA 
TAGTTTGCGC 
CGCCTAACAA 
AGGAGTTAAA 
CTATCAGAGG 
GGCCGTACAT 
GTGATATTGA 
CGGCGAGCTT 
GAGCGAGATT 
TCATTCCGTG 
CAGCGCAATG 
TGATCTGGCT 
TAGGTCCAGC 
TTTGAGGCGC 
GGCTGGCGAT 
GCGCAGTAAC 
ATGGAGCGCC 
GGCTTATCTT 
AAGAATTTGT 
TAATGTCTAA 
CTCAAGCGTT 
ATGGCTTCAT 
CCCCATGTTG 
ATTTTTCGGC 
AGCAGCCCAC 
GGAATGCTGC 
AGTCATTATC 
GGCATGCACA 
AATATCCGTT 
TATATCCTGT 
CTGATCCCCA 
TTCAATCAAC 
AAAACCAAGA 
AGTCCAAAGC 
AAAAGCTACA 
TCCAGCACAT 
AGACTTAAAG 
AGCTGGCTTG 
CGCACCTACC 
CTCTAGTACA 
GCCCACTCAC 
CCCTCTATAT 
GAACCAATCC 



TCGTTCATCC 
GGGAGGGCTT 
CGCTCACCGG 
CGAGCGCAGA 
ATTGTTGCCG 
AACGTTGTTG 
TTCATTCAAG 
CATCATGAGG 
TAGTTGGCGT 
TTGTACGGCT 
TTTGCTGGTT 
TGATCAACGA 
CTCCGCGCTG 
GCGTTATCCA 
ACATTCTTGC 
ATCTTGCTGA 
GGCGGAGGAA 
TAAATGAAAC 
GAGCGAAATG 
CGGCAAAATC 
TGCCGGCCCA 
GGACAAGAAG 
TCACTACGTG 
CAATTCGTTC 
AGATGCTGCA 
TCAGCTCCGG 
TGCAAAAAAG 
GCTGCGCTAC 
TCGACCTCTA 
TCCGTCGTCA 
GTACGGAATG 
TACAAATGGA 
ATTCTAATAA 
CAAACACTGA 
TCAAGCTTGA 
AAGGTACGAG 
AGGAACTCCC 
CTCAACAAGG 
GGAGATCAAT 
GCATCATGGT 
TTAGTGGGCA 
TGGGGACCAG 
AAAAGCATCT 
AGTGGGGAAC 
TAATGCGTAT 
AAGAAGGCAT 
TTCTAGAAGA 



ATAGTTGCCT 
ACCATCTGGC 
CTCCAGATTT 
AGTGGTCCTG 
GGAAGCTAGA 
CCATTGCTGC 
CCGACACCGC 
GAAGCGGTGA 
CATCGAGCGC 
CCGCAGTGGA 
ACGGTGACCG 
CCTTTTGGAA 
TAGAAGTCAC 
GCTAAGCGCG 
AGGTATCTTC 
CAAAAGCAAG 
CTCTTTGATC 
CTTAACGCTA 
TAGTGCTTAC 
GCGCCGAAGG 
GTATCAGCCC 
ATCGCTTGGC 
AAAGGCGAGA 
AAGCCGACGC 
GGCATCGTGG 
TTCCCAACGA 
CGGTTAGCTC 
GTCCGCKACC 
GCCGACCCAG 
GGCTTTCCGA 
CCAAGCACTC 
CGAACGGATA 
ACGCTCTTTT 
TAGTTTAAAC 
GCTCAGGATT 
CCATATCACT 
ATCCTCAAAG 
TCAGGGTACA 
GAAGAATCTT 
CAGTAAGTTT 
TCTTTGAAAG 
ACAAAAAAGG 
TTGCCTTTAT 
AAAATAACGT 
GACGAACGCA 
TCATTCCCAT 
TCTAAGCTTA 
CLAIMS

1. A recombinant, double-stranded DNA molecule containing
a) a promoter functional in plant cells, and
b) a DNA sequence coding for a polypeptide having the enzymatic activity of
a fructose-1,6-bisphosphate aldolase and operatively linked to the promoter
in sense orientation.

2. The DNA molecule according to claim 1, wherein the DNA sequence
coding for a polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase is derived from a prokaryotic organism.

3. The DNA molecule according to claim 2, wherein the prokaryotic organism is 
Escherichia coli. 

4. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 60% identity with a prokaryotic DNA sequence coding for
fructose-1,6-bisphosphate aldolase class II.

5. The DNA molecule according to claim 1, wherein the DNA sequence coding for
the polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase is a sequence capable of hybridizing with the coding region depicted as
SEQ ID NO. 1.

6. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 60% identity with the coding region depicted as SEQ ID NO. 1.

7. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 70% identity with the coding region depicted as SEQ ID NO. 1.
8. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 80% identity with the coding region depicted as SEQ ID NO. 1.

9. The DNA molecule according to claim 1, wherein the DNA sequence coding for
the polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase has the coding region depicted as SEQ ID NO. 1, or encodes the same
peptide as SEQ ID NO. 1 in accordance with the degeneracy of the genetic code.

10. A transgenic plant cell containing in its genome a recombinant DNA molecule 
according to any of claims 1-9. 

11. A transgenic plant containing plant cells according to claim 10.

12. The transgenic plant of claim 11, wherein the plant exhibits a property selected
from the group consisting of increased photosynthesis rates, increased yields, 
increased growth rates and improved solids uniformity compared with plants that 
do not contain the recombinant DNA molecule. 

13. The transgenic plant according to claim 11, which is a crop plant.

14. The transgenic plant according to claim 11, selected from the group consisting of
corn, wheat, rice, tomato, potato, carrots, sweet potato, yams, artichoke, alfalfa,
peanut, barley, cotton, soybean, canola, sunflower, sugarbeet, apple, pear, orange,
peach, sugarcane, strawberry, raspberry, banana, grape, plantain, tobacco, lettuce,
cassava, cruciferous vegetables, forestry species and horticultural species.

15. The transgenic plant of claim 11, wherein the plant is a potato. 

16. A food product derived from the potato of claim 15. 

17. The food product of claim 16, which is a french fry or a potato chip.
18. Propagation material derived from the transgenic plant of claim 11.

19. A process for increasing the photosynthesis rate in plants which comprises 
transforming plant cells with a DNA molecule according to any one of claims 1 to 
9, and regenerating the transformed cells to produce a transgenic plant. 

20. A process for increasing the yield in plants which comprises transforming plant 
cells with a DNA molecule according to any one of claims 1 to 9, and regenerating 
the transformed cells to produce a transgenic plant. 

21. A process for increasing the growth rate in plants which comprises transforming
plant cells with a DNA molecule according to any one of claims 1 to 9, and 
regenerating the transformed cells to produce a transgenic plant. 

22. A process for improving the solids uniformity in plants which comprises 
transforming plant cells with a DNA molecule according to any one of claims 1 to 
9, and regenerating the transformed cells to produce a transgenic plant. 

23. In a method for the processing of potatoes into fries or chips, the improvement 
comprising, utilizing a potato that overexpresses the fda transgene providing a 
higher solids uniformity in such potato. 
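
Claims 4 and 6 to 8 recite thresholds of "at least about" 60%, 70% and 80% identity, but the claims themselves do not state how identity is to be computed. The following is only an illustrative sketch under stated assumptions: the two sequences are taken to be already aligned to the same length (gaps written as "-"), and identity is taken as the number of matching columns divided by the number of compared columns. The function name percent_identity is hypothetical and does not come from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (assumed convention).

    Columns in which both sequences carry a gap ('-') are ignored; a gap
    opposite a base counts as a mismatch. Other conventions exist, so
    this is one possible reading of "identity", not a definitive one.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example: one substitution in nine aligned positions.
identity = percent_identity("ATGTCTAAG", "ATGTCAAAG")
print(f"{identity:.1f}% identity; meets 80% threshold: {identity >= 80.0}")
```

Under this convention a candidate coding sequence would be compared with the coding region of SEQ ID NO. 1 after alignment; the alignment step itself is outside the scope of the sketch.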
TGCAACTTGAAGTATGACGAGTATAAGGCCCGACGATACAGGACAAGAGACATGTCTAAG
                                                   MetSerLys



ATTTTTGATTTCGTAAAACCTGGCGTAATCACTGGTGATGACGTACAGAAAGTTTTCCAG 
IlePheAspPheValLysProGlyValIleThrGlyAspAspValGlnLysValPheGln



GTAGCAAAAGAAAACAACTTCGCACTGCCAGCAGTAAACTGCGTCGGTACTGACTCCATC 
ValAlaLysGluAsnAsnPheAlaLeuProAlaValAsnCysValGlyThrAspSerIle



AACGCCGTACTGGAAACCGCTGCTAAAGTTAAAGCGCCGGTTATCGTTCAGTTCTCCAAC 
AsnAlaValLeuGluThrAlaAlaLysValLysAlaProValIleValGlnPheSerAsn



GGTGGTGCTTCCTTTATCGCTGGTAAAGGCGTGAAATCTGACGTTCCGCAGGGTGCTGCT 
GlyGlyAlaSerPheIleAlaGlyLysGlyValLysSerAspValProGlnGlyAlaAla



ATCCTGGGCGCGATCTCTGGTGCGCATCACGTTCACCAGATGGCTGAACATTATGGTGTT 
IleLeuGlyAlaIleSerGlyAlaHisHisValHisGlnMetAlaGluHisTyrGlyVal



CCGGTTATCCTGCACACTGACCACTGCGCGAAGAAACTGCTGCCGTGGATCGACGGTCTG 
ProValIleLeuHisThrAspHisCysAlaLysLysLeuLeuProTrpIleAspGlyLeu



TTGGACGCGGGTGAAAAACACTTCGCAGCTACCGGTAAGCCGCTGTTCTCTTCTCACATG
LeuAspAlaGlyGluLysHisPheAlaAlaThrGlyLysProLeuPheSerSerHisMet



ATCGACCTGTCTGAAGAATCTCTGCAAGAGAACATCGAAATCTGCTCTAAATACCTGGAG 
IleAspLeuSerGluGluSerLeuGlnGluAsnIleGluIleCysSerLysTyrLeuGlu



CGCATGTCCAAAATCGGCATGACTCTGGAAATCGAACTGGGTTGCACCGGTGGTGAAGAA 
ArgMetSerLysIleGlyMetThrLeuGluIleGluLeuGlyCysThrGlyGlyGluGlu 



GACGGCGTGGACAACAGCCACATGGACGCTTCTGCACTGTACACCCAGCCGGAAGACGTT 
AspGlyValAspAsnSerHisMetAspAlaSerAlaLeuTyrThrGlnProGluAspVal 



GATTACGCATACACCGAACTGAGCAAAATCAGCCCGCGTTTCACCATCGCAGCGTCCTTC 
AspTyrAlaTyrThrGluLeuSerLysIleSerProArgPheThrIleAlaAlaSerPhe
GGTAACGTACACGGTGTTTACAAGCCGGGTAACGTGGTTCTGACTCCGACCATCCTGCGT
GlyAsnValHisGlyValTyrLysProGlyAsnValValLeuThrProThrIleLeuArg



GATTCTCAGGAATATGTTTCCAAGAAACACAACCTGCCGCACAACAGCCTGAACTTCGTA 
AspSerGlnGluTyrValSerLysLysHisAsnLeuProHisAsnSerLeuAsnPheVal 



TTCCACGGTGGTTCCGGTTCTACTGCTCAGGAAATCAAAGACTCCGTAAGCTACGGCGTA 
PheHisGlyGlySerGlySerThrAlaGlnGluIleLysAspSerValSerTyrGlyVal 



GTAAAAATGAACATCGATACCGATACCCAATGGGCAACCTGGGAAGGCGTTCTGAACTAC 
ValLysMetAsnIleAspThrAspThrGlnTrpAlaThrTrpGluGlyValLeuAsnTyr



TACAAAGCGAACGAAGCTTATCTGCAGGGTCAGCTGGGTAACCCGAAAGGCGAAGATCAG 
TyrLysAlaAsnGluAlaTyrLeuGlnGlyGlnLeuGlyAsnProLysGlyGluAspGln 



CCGAACAAGAAATACTACGATCCGCGCGTATGGCTGCGTGCCGGTCAGACTTCGATGATC 
ProAsnLysLysTyrTyrAspProArgValTrpLeuArgAlaGlyGlnThrSerMetIle



GCTCGTCTGGAGAAAGCATTCCAGGAACTGAACGCGATCGACGTTCTGTAAGATATTCCT
AlaArgLeuGluLysAlaPheGlnGluLeuAsnAlaIleAspValLeuEnd
TTCTGCTTATCTCAAGGCCCGCTCTGCGGGTCTTTTTTTCG 
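
The pairing of nucleotide and amino acid lines above can be checked mechanically. The sketch below assumes the standard genetic code and uses hypothetical helper names (CODON_TABLE, THREE_LETTER, translate) that are not part of the specification. It translates a reading frame into the three-letter notation used in the figure, and also illustrates the degeneracy of the genetic code referred to in claim 9: two different DNA sequences can encode the same peptide.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter names; '*' is written "End" as in the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, stopping after the first stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        residues.append(THREE_LETTER[aa])
        if aa == "*":
            break
    return "".join(residues)


# First three codons of the open reading frame shown in the figure.
print(translate("ATGTCTAAG"))
# A synonymous variant (different codons for Ser and Lys) gives the same peptide.
print(translate("ATGAGCAAA") == translate("ATGTCTAAG"))
```

Any nucleotide line of the figure can be passed to translate and compared with the amino acid line printed beneath it.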